From patchwork Fri Jan 15 11:30:29 2021
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 1426895
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Cc: Richard.Earnshaw@foss.arm.com
Subject: [PATCH v5 01/33] Add and restructure function declaration macros
Date: Fri, 15 Jan 2021 03:30:29 -0800

Most of these changes support subsequent patches in this series.
In particular, the FUNC_START macro becomes part of a new macro chain:

  * FUNC_ENTRY                  Common global symbol directives
  * FUNC_START_SECTION          FUNC_ENTRY to start a new <section>
  * FUNC_START                  FUNC_START_SECTION <".text">

The effective definition of FUNC_START is unchanged from the previous
version of lib1funcs.  See code comments for detailed usage.

The new names FUNC_ENTRY and FUNC_START_SECTION were chosen specifically
to complement the existing FUNC_START name.  Alternate name patterns are
possible (such as {FUNC_SYMBOL, FUNC_START_SECTION, FUNC_START_TEXT}),
but any change to FUNC_START would require refactoring much of libgcc.

Additionally, a parallel chain of new macros supports weak functions:

  * WEAK_ENTRY
  * WEAK_START_SECTION
  * WEAK_START
  * WEAK_ALIAS

Moving the CFI_* macros earlier in the file makes them available to
additional functions.

gcc/libgcc/ChangeLog:
2021-01-14 Daniel Engel

	* config/arm/lib1funcs.S:
	(LLSYM): New macro prefix ".L" for strippable local symbols.
	(CFI_START_FUNCTION, CFI_END_FUNCTION): Moved earlier in the file.
	(FUNC_ENTRY): New macro for symbols with no ".section" directive.
	(WEAK_ENTRY): New macro FUNC_ENTRY + ".weak".
	(FUNC_START_SECTION): New macro FUNC_ENTRY with <section> argument.
	(WEAK_START_SECTION): New macro FUNC_START_SECTION + ".weak".
	(FUNC_START): Redefined in terms of FUNC_START_SECTION <".text">.
	(WEAK_START): New macro FUNC_START + ".weak".
	(WEAK_ALIAS): New macro FUNC_ALIAS + ".weak".
	(FUNC_END): Moved after FUNC_START macro group.
	(THUMB_FUNC_START): Moved near the other *FUNC* macros.
	(THUMB_SYNTAX, ARM_SYM_START, SYM_END): Deleted unused macros.
---
 libgcc/config/arm/lib1funcs.S | 109 +++++++++++++++++++++-------------
 1 file changed, 69 insertions(+), 40 deletions(-)

diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
index c2fcfc503ec..f14662d7e15 100644
--- a/libgcc/config/arm/lib1funcs.S
+++ b/libgcc/config/arm/lib1funcs.S
@@ -69,11 +69,13 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define TYPE(x) .type SYM(x),function
 #define SIZE(x) .size SYM(x), . - SYM(x)
 #define LSYM(x) .x
+#define LLSYM(x) .L##x
 #else
 #define __PLT__
 #define TYPE(x)
 #define SIZE(x)
 #define LSYM(x) x
+#define LLSYM(x) x
 #endif
 
 /* Function end macros.  Variants for interworking.  */
@@ -182,6 +184,16 @@ LSYM(Lend_fde):
 #endif
 .endm
 
+.macro CFI_START_FUNCTION
+	.cfi_startproc
+	.cfi_remember_state
+.endm
+
+.macro CFI_END_FUNCTION
+	.cfi_restore_state
+	.cfi_endproc
+.endm
+
 /* Don't pass dirn, it's there just to get token pasting right.  */
 
 .macro RETLDM regs=, cond=, unwind=, dirn=ia
@@ -324,10 +336,6 @@ LSYM(Lend_fde):
 .endm
 #endif
 
-.macro FUNC_END name
-	SIZE (__\name)
-.endm
-
 .macro DIV_FUNC_END name signed
 	cfi_start __\name, LSYM(Lend_div0)
 LSYM(Ldiv0):
@@ -340,48 +348,76 @@ LSYM(Ldiv0):
 	FUNC_END \name
 .endm
 
-.macro THUMB_FUNC_START name
-	.globl SYM (\name)
-	TYPE (\name)
-	.thumb_func
-SYM (\name):
-.endm
-
 /* Function start macros.  Variants for ARM and Thumb.  */
 
 #ifdef __thumb__
 #define THUMB_FUNC .thumb_func
 #define THUMB_CODE .force_thumb
-# if defined(__thumb2__)
-#define THUMB_SYNTAX
-# else
-#define THUMB_SYNTAX
-# endif
 #else
 #define THUMB_FUNC
 #define THUMB_CODE
-#define THUMB_SYNTAX
 #endif
 
+.macro THUMB_FUNC_START name
+	.globl SYM (\name)
+	TYPE (\name)
+	.thumb_func
+SYM (\name):
+.endm
+
+/* Strong global symbol, ".text" section.
+   The default macro for function declarations.  */
 .macro FUNC_START name
-	.text
+	FUNC_START_SECTION \name .text
+.endm
+
+/* Weak global symbol, ".text" section.
+   Use WEAK_* macros to declare a function/object that may be discarded by
+   the linker when another library or object exports the same name.
+   Typically, functions declared with WEAK_* macros implement a subset of
+   functionality provided by the overriding definition, and are discarded
+   when the full functionality is required.  */
+.macro WEAK_START name
+	.weak SYM(__\name)
+	FUNC_START_SECTION \name .text
+.endm
+
+/* Strong global symbol, alternate section.
+   Use the *_START_SECTION macros for declarations that the linker should
+   place in a non-default section (e.g. ".rodata", ".text.subsection").  */
+.macro FUNC_START_SECTION name section
+	.section \section,"x"
+	.align 0
+	FUNC_ENTRY \name
+.endm
+
+/* Weak global symbol, alternate section.  */
+.macro WEAK_START_SECTION name section
+	.weak SYM(__\name)
+	FUNC_START_SECTION \name \section
+.endm
+
+/* Strong global symbol.
+   Use *_ENTRY macros internal to a function/object body to declare a second
+   or subsequent entry point without changing the assembler state.
+   Because there is no alignment specification, these macros should never
+   replace the *_START_* macros as the first declaration in any object.  */
+.macro FUNC_ENTRY name
 	.globl SYM (__\name)
 	TYPE (__\name)
-	.align 0
 	THUMB_CODE
 	THUMB_FUNC
-	THUMB_SYNTAX
 SYM (__\name):
 .endm
 
-.macro ARM_SYM_START name
-	TYPE (\name)
-	.align 0
-SYM (\name):
+/* Weak global symbol.  */
+.macro WEAK_ENTRY name
+	.weak SYM(__\name)
+	FUNC_ENTRY \name
 .endm
 
-.macro SYM_END name
-	SIZE (\name)
+.macro FUNC_END name
+	SIZE (__\name)
 .endm
 
 /* Special function that will always be coded in ARM assembly, even if
@@ -447,6 +483,11 @@ SYM (__\name):
 #endif
 .endm
 
+.macro WEAK_ALIAS new old
+	.weak SYM(__\new)
+	FUNC_ALIAS \new \old
+.endm
+
 #ifndef NOT_ISA_TARGET_32BIT
 .macro ARM_FUNC_ALIAS new old
 	.globl SYM (__\new)
@@ -1459,10 +1500,8 @@ LSYM(Lover12):
 #ifdef L_dvmd_tls
 
 #ifdef __ARM_EABI__
-	WEAK aeabi_idiv0
-	WEAK aeabi_ldiv0
-	FUNC_START aeabi_idiv0
-	FUNC_START aeabi_ldiv0
+	WEAK_START aeabi_idiv0
+	WEAK_START aeabi_ldiv0
 	RET
 	FUNC_END aeabi_ldiv0
 	FUNC_END aeabi_idiv0
@@ -2170,16 +2209,6 @@ LSYM(Lchange_\register):
 
 #endif /* Arch supports thumb.  */
 
-.macro CFI_START_FUNCTION
-	.cfi_startproc
-	.cfi_remember_state
-.endm
-
-.macro CFI_END_FUNCTION
-	.cfi_restore_state
-	.cfi_endproc
-.endm
-
 #ifndef __symbian__
 /* The condition here must match the one in gcc/config/arm/elf.h and
    libgcc/config/arm/t-elf.  */

From patchwork Fri Jan 15 11:30:30 2021
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 1426897
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Cc: Richard.Earnshaw@foss.arm.com
Subject: [PATCH v5 02/33] Rename THUMB_FUNC_START to THUMB_FUNC_ENTRY
Date: Fri, 15 Jan 2021 03:30:30 -0800
Message-Id: <2b83afdc9d09511532f8317d4e53e1e483232353.1610709584.git.gnu@danielengel.com>

Since THUMB_FUNC_START does not insert the ".text" directive, it aligns
more closely with the new FUNC_ENTRY macro and is renamed accordingly.

THUMB_FUNC_START usage has been universally synonymous with the
".force_thumb" directive, so this is now folded into the definition.
Usage of ".force_thumb" and ".thumb_func" is now tightly coupled
throughout the "arm" subdirectory.

gcc/libgcc/ChangeLog:
2021-01-14 Daniel Engel

	* config/arm/lib1funcs.S:
	(THUMB_FUNC_START): Renamed to ...
	(THUMB_FUNC_ENTRY): for consistency; also added ".force_thumb".
	(_call_via_r0): Removed redundant preceding ".force_thumb".
	(__gnu_thumb1_case_sqi, __gnu_thumb1_case_uqi,
	__gnu_thumb1_case_shi, __gnu_thumb1_case_si):
	Removed redundant ".force_thumb" and ".syntax".
---
 libgcc/config/arm/lib1funcs.S | 32 +++++++++++---------------------
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
index f14662d7e15..65d070d8178 100644
--- a/libgcc/config/arm/lib1funcs.S
+++ b/libgcc/config/arm/lib1funcs.S
@@ -358,10 +358,11 @@ LSYM(Ldiv0):
 #define THUMB_CODE
 #endif
 
-.macro THUMB_FUNC_START name
+.macro THUMB_FUNC_ENTRY name
 	.globl SYM (\name)
 	TYPE (\name)
 	.thumb_func
+	.force_thumb
 SYM (\name):
 .endm
 
@@ -1944,10 +1945,9 @@ ARM_FUNC_START ctzsi2
 
 	.text
 	.align 0
-	.force_thumb
 
 .macro call_via register
-	THUMB_FUNC_START _call_via_\register
+	THUMB_FUNC_ENTRY _call_via_\register
 
 	bx \register
 	nop
@@ -2030,7 +2030,7 @@ _arm_return_r11:
 .macro interwork_with_frame frame, register, name, return
 	.code 16
 
-	THUMB_FUNC_START \name
+	THUMB_FUNC_ENTRY \name
 
 	bx pc
 	nop
@@ -2047,7 +2047,7 @@ _arm_return_r11:
 .macro interwork register
 	.code 16
 
-	THUMB_FUNC_START _interwork_call_via_\register
+	THUMB_FUNC_ENTRY _interwork_call_via_\register
 
 	bx pc
 	nop
@@ -2084,7 +2084,7 @@ LSYM(Lchange_\register):
 /* The LR case has to be handled a little differently...  */
 	.code 16
 
-	THUMB_FUNC_START _interwork_call_via_lr
+	THUMB_FUNC_ENTRY _interwork_call_via_lr
 
 	bx pc
 	nop
@@ -2112,9 +2112,7 @@ LSYM(Lchange_\register):
 
 	.text
 	.align 0
-	.force_thumb
-	.syntax unified
-	THUMB_FUNC_START __gnu_thumb1_case_sqi
+	THUMB_FUNC_ENTRY __gnu_thumb1_case_sqi
 	push {r1}
 	mov r1, lr
 	lsrs r1, r1, #1
@@ -2131,9 +2129,7 @@ LSYM(Lchange_\register):
 
 	.text
 	.align 0
-	.force_thumb
-	.syntax unified
-	THUMB_FUNC_START __gnu_thumb1_case_uqi
+	THUMB_FUNC_ENTRY __gnu_thumb1_case_uqi
 	push {r1}
 	mov r1, lr
 	lsrs r1, r1, #1
@@ -2150,9 +2146,7 @@ LSYM(Lchange_\register):
 
 	.text
 	.align 0
-	.force_thumb
-	.syntax unified
-	THUMB_FUNC_START __gnu_thumb1_case_shi
+	THUMB_FUNC_ENTRY __gnu_thumb1_case_shi
 	push {r0, r1}
 	mov r1, lr
 	lsrs r1, r1, #1
@@ -2170,9 +2164,7 @@ LSYM(Lchange_\register):
 
 	.text
 	.align 0
-	.force_thumb
-	.syntax unified
-	THUMB_FUNC_START __gnu_thumb1_case_uhi
+	THUMB_FUNC_ENTRY __gnu_thumb1_case_uhi
 	push {r0, r1}
 	mov r1, lr
 	lsrs r1, r1, #1
@@ -2190,9 +2182,7 @@ LSYM(Lchange_\register):
 
 	.text
 	.align 0
-	.force_thumb
-	.syntax unified
-	THUMB_FUNC_START __gnu_thumb1_case_si
+	THUMB_FUNC_ENTRY __gnu_thumb1_case_si
 	push {r0, r1}
 	mov r1, lr
 	adds.n r1, r1, #2 /* Align to word.  */

From patchwork Fri Jan 15 11:30:31 2021
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 1426898
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Cc: Richard.Earnshaw@foss.arm.com
Subject: [PATCH v5 03/33] Fix syntax warnings on conditional instructions
Date: Fri, 15 Jan 2021 03:30:31 -0800

gcc/libgcc/ChangeLog:
2021-01-14 Daniel Engel

	* config/arm/lib1funcs.S (RETLDM, ARM_DIV_BODY, ARM_MOD_BODY,
	_interwork_call_via_lr): Moved condition code after the flags
	update specifier "s".
	(ARM_FUNC_START, THUMB_LDIV0): Removed redundant ".syntax".
---
 libgcc/config/arm/lib1funcs.S | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
index 65d070d8178..b8693be8e4f 100644
--- a/libgcc/config/arm/lib1funcs.S
+++ b/libgcc/config/arm/lib1funcs.S
@@ -204,7 +204,7 @@ LSYM(Lend_fde):
 # if defined(__thumb2__)
 	pop\cond {\regs, lr}
 # else
-	ldm\cond\dirn sp!, {\regs, lr}
+	ldm\dirn\cond sp!, {\regs, lr}
 # endif
 	.endif
 	.ifnc "\unwind", ""
@@ -220,7 +220,7 @@ LSYM(Lend_fde):
 # if defined(__thumb2__)
 	pop\cond {\regs, pc}
 # else
-	ldm\cond\dirn sp!, {\regs, pc}
+	ldm\dirn\cond sp!, {\regs, pc}
 # endif
 	.endif
 #endif
@@ -292,7 +292,6 @@ LSYM(Lend_fde):
 	pop {r1, pc}
 
 #elif defined(__thumb2__)
-	.syntax unified
 	.ifc \signed, unsigned
 	cbz r0, 1f
 	mov r0, #0xffffffff
@@ -429,7 +428,6 @@ SYM (__\name):
 /* For Thumb-2 we build everything in thumb mode.  */
 .macro ARM_FUNC_START name
 	FUNC_START \name
-	.syntax unified
 .endm
 #define EQUIV .thumb_set
 .macro ARM_CALL name
@@ -643,7 +641,7 @@ pc .req r15
 	orrhs \result, \result, \curbit, lsr #3
 	cmp \dividend, #0 @ Early termination?
 	do_it ne, t
-	movnes \curbit, \curbit, lsr #4 @ No, any more bits to do?
+	movsne \curbit, \curbit, lsr #4 @ No, any more bits to do?
 	movne \divisor, \divisor, lsr #4
 	bne 1b
 
@@ -745,7 +743,7 @@ pc .req r15
 	subhs \dividend, \dividend, \divisor, lsr #3
 	cmp \dividend, #1
 	mov \divisor, \divisor, lsr #4
-	subges \order, \order, #4
+	subsge \order, \order, #4
 	bge 1b
 
 	tst \order, #3
@@ -2093,7 +2091,7 @@ LSYM(Lchange_\register):
 	.globl .Lchange_lr
 .Lchange_lr:
 	tst lr, #1
-	stmeqdb r13!, {lr, pc}
+	stmdbeq r13!, {lr, pc}
 	mov ip, lr
 	adreq lr, _arm_return
 	bx ip

From patchwork Fri Jan 15 11:30:32 2021
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 1426899
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Cc: Richard.Earnshaw@foss.arm.com
Subject: [PATCH v5 04/33] Reorganize LIB1ASMFUNCS object wrapper macros
Date: Fri, 15 Jan 2021 03:30:32 -0800
Message-Id: <95954856eb4da1cf7bd94643d575b137f2775752.1610709584.git.gnu@danielengel.com>

This will make it easier to isolate changes in subsequent patches.

gcc/libgcc/ChangeLog:
2021-01-14 Daniel Engel

	* config/arm/t-elf (LIB1ASMFUNCS): Split macros into logical groups.
---
 libgcc/config/arm/t-elf | 66 +++++++++++++++++++++++++++++++++--------
 1 file changed, 53 insertions(+), 13 deletions(-)

diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf
index 9da6cd37054..93ea1cd8f76 100644
--- a/libgcc/config/arm/t-elf
+++ b/libgcc/config/arm/t-elf
@@ -14,19 +14,59 @@ LIB1ASMFUNCS += _arm_muldf3 _arm_mulsf3
 endif
 endif # !__symbian__
 
-# For most CPUs we have an assembly soft-float implementations.
-# However this is not true for ARMv6M.  Here we want to use the soft-fp C
-# implementation.  The soft-fp code is only build for ARMv6M.  This pulls
-# in the asm implementation for other CPUs.
-LIB1ASMFUNCS += _udivsi3 _divsi3 _umodsi3 _modsi3 _dvmd_tls _bb_init_func \
-	_call_via_rX _interwork_call_via_rX \
-	_lshrdi3 _ashrdi3 _ashldi3 \
-	_arm_negdf2 _arm_addsubdf3 _arm_muldivdf3 _arm_cmpdf2 _arm_unorddf2 \
-	_arm_fixdfsi _arm_fixunsdfsi \
-	_arm_truncdfsf2 _arm_negsf2 _arm_addsubsf3 _arm_muldivsf3 \
-	_arm_cmpsf2 _arm_unordsf2 _arm_fixsfsi _arm_fixunssfsi \
-	_arm_floatdidf _arm_floatdisf _arm_floatundidf _arm_floatundisf \
-	_clzsi2 _clzdi2 _ctzsi2
+# This pulls in the available assembly function implementations.
+# The soft-fp code is only built for ARMv6M, since there is no
+# assembly implementation here for double-precision values.
+
+
+# Group 1: Integer function objects.
+LIB1ASMFUNCS += \
+	_ashldi3 \
+	_ashrdi3 \
+	_lshrdi3 \
+	_clzdi2 \
+	_clzsi2 \
+	_ctzsi2 \
+	_dvmd_tls \
+	_divsi3 \
+	_modsi3 \
+	_udivsi3 \
+	_umodsi3 \
+
+
+# Group 2: Single precision floating point function objects.
+LIB1ASMFUNCS += \
+	_arm_addsubsf3 \
+	_arm_cmpsf2 \
+	_arm_fixsfsi \
+	_arm_fixunssfsi \
+	_arm_floatdisf \
+	_arm_floatundisf \
+	_arm_muldivsf3 \
+	_arm_negsf2 \
+	_arm_unordsf2 \
+
+
+# Group 3: Double precision floating point function objects.
+LIB1ASMFUNCS += \
+	_arm_addsubdf3 \
+	_arm_cmpdf2 \
+	_arm_fixdfsi \
+	_arm_fixunsdfsi \
+	_arm_floatdidf \
+	_arm_floatundidf \
+	_arm_muldivdf3 \
+	_arm_negdf2 \
+	_arm_truncdfsf2 \
+	_arm_unorddf2 \
+
+
+# Group 4: Miscellaneous function objects.
+LIB1ASMFUNCS += \
+	_bb_init_func \
+	_call_via_rX \
+	_interwork_call_via_rX \
+
 
 # Currently there is a bug somewhere in GCC's alias analysis
 # or scheduling code that is breaking _fpmul_parts in fp-bit.c.

From patchwork Fri Jan 15 11:30:33 2021
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 1426900
ESMTPS id 4DHJt344cpz9sRK for ; Fri, 15 Jan 2021 22:31:39 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 07DCD39730C2; Fri, 15 Jan 2021 11:31:18 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id 66AD63846078 for ; Fri, 15 Jan 2021 11:31:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 66AD63846078 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 760A3D42; Fri, 15 Jan 2021 06:31:14 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:31:14 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=CeEqcze6Pz7Hm mked1qzcnj11SpjwSBoX8dvhS7JYGU=; b=VDYFiHef/AvB4MrP1qwOdPXxlxxJ3 mJjzOsdfd+91Rs8L6H/M82uNND0viHMOjuTqA/DR0Waq0TcqrmgHH5FjyVLt5rhV 8L0K/r7TT665Juq3XUrA6cqPx+zkPQOYAI3gKfLzZfDLOCLf4jN3cKVTHgl4g50C ZOz6GVlgkwNeC6b5AgwqUv8dUpVfyfZDXZ6xhcHVHdhLiJNjKd5EAHXyJUmIIqp+ l9c8lPAaagE4xR+x8YxoASWCTA08avhP2qHeYt2VJb0d9B/G8SXupN2f7+0O1eJ+ gmdesOsaTOrjY5cLXWZzvZFoegRqgJx9TfMyc3ncoYI7SEkAofXQOwJFA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=CeEqcze6Pz7Hmmked1qzcnj11SpjwSBoX8dvhS7JYGU=; b=OsPD+KRj mAZbLcjn8h//sy59HVj1dffHRuisO6pQkNxWrCIVHWsoMACTDgD4ZhRXQihf71hw 
RmH2e3qR4gRhERDo9tZEGo7Z6nKNFe64SBdEJ+JG6u9HZDc43MZ1gp2db6g0tFeO 2LbqMzHsG3wFC67PE5ePB4IJG+K9wvFv1SaudF5XzSzrVn8TIEjwuAqXvwTxubdj HL7X1nyH40PhfsH/V5qle4rpV/l+R5NXgEMAlF+kQ1V5Qt5LtEKsg/qY8rC4A6c1 cNSTYkVSb31SiXvyhmxCST01ptlX/tOotq6AMRKjKNaZqKOteI/QyxwSVDu0ZmwG Q1j/zo3Y2F6YsA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddtgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpefhvefhgfdtgfffieekhefhudevue eludffueeuffdviedtveehvdejgeelleevveenucffohhmrghinheplhhisgdufhhunhgt shdrshgsnecukfhppeejuddrfeeirddutddtrddvvddtnecuvehluhhsthgvrhfuihiivg epudenucfrrghrrghmpehmrghilhhfrhhomhepghhnuhesuggrnhhivghlvghnghgvlhdr tghomh X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id AA0391080057; Fri, 15 Jan 2021 06:31:13 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVC2Z023706; Fri, 15 Jan 2021 03:31:12 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 05/33] Add the __HAVE_FEATURE_IT and IT() macros Date: Fri, 15 Jan 2021 03:30:33 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 
2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" These macros complement and extend the existing do_it() macro. Together, they streamline the process of optimizing short branchless conditional sequences to support ARM, Thumb-2, and Thumb-1. The inherent architecture limitations of Thumb-1 mean that writing assembly code is somewhat more tedious. And, while such code will run unmodified in an ARM or Thumb-2 environment, it will lack one of the key performance optimizations available there. The first idea might be to split an instruction sequence with #ifdef(s): one path for Thumb-1 and the other for ARM/Thumb-2. This could suffice if conditional execution optimizations were rare. However, #ifdef(s) break the flow of an algorithm and shift focus to the architectural differences instead of the similarities. On functions with a high percentage of conditional execution, it starts to become attractive to split everything into distinct architecture-specific function objects -- even when the underlying algorithm is identical. Additionally, duplicated code and comments (whether an individual operand, a line, or a larger block) become a future maintenance liability if the two versions aren't kept in sync. See code comments for limitations and expected usage. gcc/libgcc/ChangeLog: 2021-01-14 Daniel Engel (__HAVE_FEATURE_IT, IT): New macros. --- libgcc/config/arm/lib1funcs.S | 68 +++++++++++++++++++++++++++++++++++ 1 file changed, 68 insertions(+) diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index b8693be8e4f..1233b8c0992 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -230,6 +230,7 @@ LSYM(Lend_fde): ARM and Thumb-2.
However this is only supported by recent gas, so define a set of macros to allow ARM code on older assemblers. */ #if defined(__thumb2__) +#define __HAVE_FEATURE_IT .macro do_it cond, suffix="" it\suffix \cond .endm @@ -245,6 +246,9 @@ LSYM(Lend_fde): \name \dest, \src1, \tmp .endm #else +#if !defined(__thumb__) +#define __HAVE_FEATURE_IT +#endif .macro do_it cond, suffix="" .endm .macro shift1 op, arg0, arg1, arg2 @@ -259,6 +263,70 @@ LSYM(Lend_fde): #define COND(op1, op2, cond) op1 ## op2 ## cond + +/* The IT() macro streamlines the construction of short branchless conditional + sequences that support ARM, Thumb-2, and Thumb-1. It is intended as an + extension to the .do_it macro defined above. Code not written with the + intent to support Thumb-1 need not use IT(). + + IT()'s main advantage is the minimization of syntax differences. Unified + functions can support Thumb-1 without imposing an undue performance + penalty on ARM and Thumb-2. Writing code without duplicate instructions + and operands keeps the high level function flow clearer and should reduce + the incidence of maintenance bugs. + + Where conditional execution is supported by ARM and Thumb-2, the specified + instruction compiles with the conditional suffix 'c'. + + Where Thumb-1 and v6m do not support IT, the given instruction compiles + with the standard unified syntax suffix "s", and a preceding branch + instruction is required to implement conditional behavior. + + (Aside: The Thumb-1 "s"-suffix pattern is somewhat simplistic, since it + does not support 'cmp' or 'tst' with a non-"s" suffix. It also appends + "s" to 'mov' and 'add' with high register operands which are otherwise + legal on v6m. Use of IT() will result in a compiler error for all of + these exceptional cases, and a full #ifdef code split will be required. + However, it is unlikely that code written with Thumb-1 compatibility + in mind will use such patterns, so IT() still promises a good value.)
+ + Typical if/then/else usage is: + + #ifdef __HAVE_FEATURE_IT + // ARM and Thumb-2 'true' condition. + do_it c, tee + #else + // Thumb-1 'false' condition. This must be opposite the + // sense of the ARM and Thumb-2 condition, since the + // branch is taken to skip the 'true' instruction block. + b!c else_label + #endif + + // Conditional 'true' execution for all compile modes. + IT(ins1,c) op1, op2 + IT(ins2,c) op1, op2 + + #ifndef __HAVE_FEATURE_IT + // Thumb-1 branch to skip the 'else' instruction block. + // Omitted for if/then usage. + b end_label + #endif + + else_label: + // Conditional 'false' execution for all compile modes. + // Omitted for if/then usage. + IT(ins3,!c) op1, op2 + IT(ins4,!c) op1, op2 + + end_label: + // Unconditional execution resumes here. + */ +#ifdef __HAVE_FEATURE_IT + #define IT(ins,c) ins##c +#else + #define IT(ins,c) ins##s +#endif + #ifdef __ARM_EABI__ .macro ARM_LDIV0 name signed cmp r0, #0 From patchwork Fri Jan 15 11:30:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426901 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=KSXWCumm; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=mOcLiT1Z; dkim-atps=neutral Received: from sourceware.org 
(server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJt92L7Vz9sRK for ; Fri, 15 Jan 2021 22:31:45 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 6FA71397307C; Fri, 15 Jan 2021 11:31:20 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id 739F23846078 for ; Fri, 15 Jan 2021 11:31:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 739F23846078 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 845F4C14; Fri, 15 Jan 2021 06:31:16 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:31:16 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=U25SZ3FF3aE+L XvrovZ4tsQ/3w0m6IfYotbcyFqR420=; b=KSXWCumm+VTvxOv5lo+f9U7jOQBuy GYa/DENAEKYdUxrwUKRXCaNfisglCYp88eqdSR1zJJ+VWYxtc09eBiFJyY5COfUV b699mqQTbhTz1iVhczeff7w0K50ho4If3ITtWYtkCdgjY0BKYGzE5vdUlS2yqn2u S7veMl23jHQWU34phmnwBUUK31dXnCKac7d43QFe2QyjHHaIxTxTWBWs6bqG/Be/ Ud5l3kuWs9zePZClpt578p/zJrR78JlYi7qoMj2+Z+Wfa7XumTJ/E4r7BKaNiUES OMjN+RryKX9KbpIOTfHx8C5Jh3ybK2w8ocbzbUcHp8MzQgpQQkBT9V+uw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from 
:in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=U25SZ3FF3aE+LXvrovZ4tsQ/3w0m6IfYotbcyFqR420=; b=mOcLiT1Z jY8T8qZ5vE0uj1ULMuw7b4Mg/RsYinvpgWNNAymzMAdz2gn3g0IEkkxhiTO0zP7b 5+kQvb2DwWEr5aDS+sDqZup0sTkhxjSXHfyfT4T77UuBSS5InmWd6TpC5uyRe3Jl kdndZNXsD9I+J1l00fERV2v3u9gep4jNjQPzWER9DcPBqTXXMX1EwC9L0B6WalY7 /Z/1L/+zR1eRUC4YUtuPaEDq4aBSp1/Tphpk4h9A+BuF0+JAc46z+R4XSTXY8D0Y Nppn0Qh4F5p4XO30fTJjQF9iPXz9ggLMFU4ZGphsNopBx1pTiymhah2bvVzjfVQO 5ha0DN3/XatozQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddtgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeeiteekvedukefhfefghfekheehje fhieeiuddvgfeljeeifedvueetueejffekueenucffohhmrghinheptghliidvrdhssgdp ghhnuhdrohhrghdplhhisgdufhhunhgtshdrshgsnecukfhppeejuddrfeeirddutddtrd dvvddtnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhep ghhnuhesuggrnhhivghlvghnghgvlhdrtghomh X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id C74921080059; Fri, 15 Jan 2021 06:31:15 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVEqs023709; Fri, 15 Jan 2021 03:31:14 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 06/33] Refactor 'clz' functions into a new file Date: Fri, 15 Jan 2021 03:30:34 -0800 Message-Id: <7ba357b9910b4cddcf0982580d831d4df3f750b7.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, 
DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SCC_10_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" This will make it easier to isolate changes in subsequent patches. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/lib1funcs.S (__clzsi2, __clzdi2): Moved to ... * config/arm/clz2.S: New file. --- libgcc/config/arm/clz2.S | 145 ++++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 123 +--------------------------- 2 files changed, 146 insertions(+), 122 deletions(-) create mode 100644 libgcc/config/arm/clz2.S diff --git a/libgcc/config/arm/clz2.S b/libgcc/config/arm/clz2.S new file mode 100644 index 00000000000..2ad9a81892c --- /dev/null +++ b/libgcc/config/arm/clz2.S @@ -0,0 +1,145 @@ +/* Copyright (C) 1995-2021 Free Software Foundation, Inc. + +This file is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the +Free Software Foundation; either version 3, or (at your option) any +later version. + +This file is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +General Public License for more details. + +Under Section 7 of GPL version 3, you are granted additional +permissions described in the GCC Runtime Library Exception, version +3.1, as published by the Free Software Foundation.
+ +You should have received a copy of the GNU General Public License and +a copy of the GCC Runtime Library Exception along with this program; +see the files COPYING3 and COPYING.RUNTIME respectively. If not, see +. */ + + +#ifdef L_clzsi2 +#ifdef NOT_ISA_TARGET_32BIT +FUNC_START clzsi2 + movs r1, #28 + movs r3, #1 + lsls r3, r3, #16 + cmp r0, r3 /* 0x10000 */ + bcc 2f + lsrs r0, r0, #16 + subs r1, r1, #16 +2: lsrs r3, r3, #8 + cmp r0, r3 /* #0x100 */ + bcc 2f + lsrs r0, r0, #8 + subs r1, r1, #8 +2: lsrs r3, r3, #4 + cmp r0, r3 /* #0x10 */ + bcc 2f + lsrs r0, r0, #4 + subs r1, r1, #4 +2: adr r2, 1f + ldrb r0, [r2, r0] + adds r0, r0, r1 + bx lr +.align 2 +1: +.byte 4, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0 + FUNC_END clzsi2 +#else +ARM_FUNC_START clzsi2 +# if defined (__ARM_FEATURE_CLZ) + clz r0, r0 + RET +# else + mov r1, #28 + cmp r0, #0x10000 + do_it cs, t + movcs r0, r0, lsr #16 + subcs r1, r1, #16 + cmp r0, #0x100 + do_it cs, t + movcs r0, r0, lsr #8 + subcs r1, r1, #8 + cmp r0, #0x10 + do_it cs, t + movcs r0, r0, lsr #4 + subcs r1, r1, #4 + adr r2, 1f + ldrb r0, [r2, r0] + add r0, r0, r1 + RET +.align 2 +1: +.byte 4, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0 +# endif /* !defined (__ARM_FEATURE_CLZ) */ + FUNC_END clzsi2 +#endif +#endif /* L_clzsi2 */ + +#ifdef L_clzdi2 +#if !defined (__ARM_FEATURE_CLZ) + +# ifdef NOT_ISA_TARGET_32BIT +FUNC_START clzdi2 + push {r4, lr} + cmp xxh, #0 + bne 1f +# ifdef __ARMEB__ + movs r0, xxl + bl __clzsi2 + adds r0, r0, #32 + b 2f +1: + bl __clzsi2 +# else + bl __clzsi2 + adds r0, r0, #32 + b 2f +1: + movs r0, xxh + bl __clzsi2 +# endif +2: + pop {r4, pc} +# else /* NOT_ISA_TARGET_32BIT */ +ARM_FUNC_START clzdi2 + do_push {r4, lr} + cmp xxh, #0 + bne 1f +# ifdef __ARMEB__ + mov r0, xxl + bl __clzsi2 + add r0, r0, #32 + b 2f +1: + bl __clzsi2 +# else + bl __clzsi2 + add r0, r0, #32 + b 2f +1: + mov r0, xxh + bl __clzsi2 +# endif +2: + RETLDM r4 + FUNC_END clzdi2 +# endif /* NOT_ISA_TARGET_32BIT */ + +#else /* defined 
(__ARM_FEATURE_CLZ) */ + +ARM_FUNC_START clzdi2 + cmp xxh, #0 + do_it eq, et + clzeq r0, xxl + clzne r0, xxh + addeq r0, r0, #32 + RET + FUNC_END clzdi2 + +#endif +#endif /* L_clzdi2 */ + diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index 1233b8c0992..d92f73ba0c9 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -1803,128 +1803,7 @@ LSYM(Lover12): #endif /* __symbian__ */ -#ifdef L_clzsi2 -#ifdef NOT_ISA_TARGET_32BIT -FUNC_START clzsi2 - movs r1, #28 - movs r3, #1 - lsls r3, r3, #16 - cmp r0, r3 /* 0x10000 */ - bcc 2f - lsrs r0, r0, #16 - subs r1, r1, #16 -2: lsrs r3, r3, #8 - cmp r0, r3 /* #0x100 */ - bcc 2f - lsrs r0, r0, #8 - subs r1, r1, #8 -2: lsrs r3, r3, #4 - cmp r0, r3 /* #0x10 */ - bcc 2f - lsrs r0, r0, #4 - subs r1, r1, #4 -2: adr r2, 1f - ldrb r0, [r2, r0] - adds r0, r0, r1 - bx lr -.align 2 -1: -.byte 4, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0 - FUNC_END clzsi2 -#else -ARM_FUNC_START clzsi2 -# if defined (__ARM_FEATURE_CLZ) - clz r0, r0 - RET -# else - mov r1, #28 - cmp r0, #0x10000 - do_it cs, t - movcs r0, r0, lsr #16 - subcs r1, r1, #16 - cmp r0, #0x100 - do_it cs, t - movcs r0, r0, lsr #8 - subcs r1, r1, #8 - cmp r0, #0x10 - do_it cs, t - movcs r0, r0, lsr #4 - subcs r1, r1, #4 - adr r2, 1f - ldrb r0, [r2, r0] - add r0, r0, r1 - RET -.align 2 -1: -.byte 4, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0 -# endif /* !defined (__ARM_FEATURE_CLZ) */ - FUNC_END clzsi2 -#endif -#endif /* L_clzsi2 */ - -#ifdef L_clzdi2 -#if !defined (__ARM_FEATURE_CLZ) - -# ifdef NOT_ISA_TARGET_32BIT -FUNC_START clzdi2 - push {r4, lr} - cmp xxh, #0 - bne 1f -# ifdef __ARMEB__ - movs r0, xxl - bl __clzsi2 - adds r0, r0, #32 - b 2f -1: - bl __clzsi2 -# else - bl __clzsi2 - adds r0, r0, #32 - b 2f -1: - movs r0, xxh - bl __clzsi2 -# endif -2: - pop {r4, pc} -# else /* NOT_ISA_TARGET_32BIT */ -ARM_FUNC_START clzdi2 - do_push {r4, lr} - cmp xxh, #0 - bne 1f -# ifdef __ARMEB__ - mov r0, xxl - bl __clzsi2 - add r0, 
r0, #32 - b 2f -1: - bl __clzsi2 -# else - bl __clzsi2 - add r0, r0, #32 - b 2f -1: - mov r0, xxh - bl __clzsi2 -# endif -2: - RETLDM r4 - FUNC_END clzdi2 -# endif /* NOT_ISA_TARGET_32BIT */ - -#else /* defined (__ARM_FEATURE_CLZ) */ - -ARM_FUNC_START clzdi2 - cmp xxh, #0 - do_it eq, et - clzeq r0, xxl - clzne r0, xxh - addeq r0, r0, #32 - RET - FUNC_END clzdi2 - -#endif -#endif /* L_clzdi2 */ +#include "clz2.S" #ifdef L_ctzsi2 #ifdef NOT_ISA_TARGET_32BIT From patchwork Fri Jan 15 11:30:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426902 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=EvxQ+hUd; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=dDI705nX; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJtF2tKjz9sSC for ; Fri, 15 Jan 2021 22:31:49 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D477C397307B; Fri, 15 Jan 2021 11:31:21 +0000 (GMT) X-Original-To: 
gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id DAD133973049 for ; Fri, 15 Jan 2021 11:31:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org DAD133973049 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute6.internal (compute6.nyi.internal [10.202.2.46]) by mailout.west.internal (Postfix) with ESMTP id EA24AE07; Fri, 15 Jan 2021 06:31:18 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute6.internal (MEProxy); Fri, 15 Jan 2021 06:31:19 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=0KhhwKgFOewm0 VJh6lImUMOxWZozsi/bMx5noPDLSyo=; b=EvxQ+hUd0H1/p78FcQ5sPnHRTG/La cWREywG5l+m/9jZjsXC5EwI4IhOt3X7zaVdSM/Y3gDvdeAociyrDVZRRi4/7pZPU teuvjfcc8N3n9tL3Vu0ZbuIkF849a05XJRMGETQBESZuuyYl/MWt6m/N1h2LFqxM dhF+DRpoqF3zvO/r6UXNi52VifEPMoDaHniFNtN5WNJvSNik8xeZTAiEqwWY5upN 8p3/Jvx0aZrwhDspS/q5QbyvXLCWdduExowu8kSVzo3S34qNU7iDx3eZY+8ppKOr 6u3+yY3mJ38dTsGmkcgP0HtI8fBOnV48gxHYpjlmdwFrcG7AcIsuw22tQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=0KhhwKgFOewm0VJh6lImUMOxWZozsi/bMx5noPDLSyo=; b=dDI705nX hMAu8ev4hgUHrn1lDwFPYYdeP0wIkI+BSycOtzAL1UVIz7YyVXXxdjECUSUDA8y5 LqYHgzGXDHLYpXnCwBod+yK6OOmLumYb7XGRAgqv7R5EujNpU6I/4N4VG1izNdcd YWXrcB/ym1kF+JyIwG6Ea1zGFYPE73nU0V1K3Ale17roTQ7sgTJZ+eLgEDuIdZf8 ntBJ30i+meQGt30RE4dF9RJ6sR03dKCK7QPKnEAjQvzQHKooXYU/FBZT0hnNcDVs WiePl6z//RF9kedw6tcCB3+V7OnmSNemvoN6nIbjSpPSH780xXHrBYwNjtjCzl3f 
BcOBK74H6xOLEg== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddthecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeeiudegudeuhedtfefhgfeftdegff eijeetieeltefhgffgueelvefggeefvddvgeenucffohhmrghinheptghtiidvrdhssgdp ghhnuhdrohhrghdplhhisgdufhhunhgtshdrshgsnecukfhppeejuddrfeeirddutddtrd dvvddtnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhep ghhnuhesuggrnhhivghlvghnghgvlhdrtghomh X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id E5629108005B; Fri, 15 Jan 2021 06:31:17 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVGD3023712; Fri, 15 Jan 2021 03:31:16 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 07/33] Refactor 'ctz' functions into a new file Date: Fri, 15 Jan 2021 03:30:35 -0800 Message-Id: <3a0975cb7141e46cf0cf20a6363422b775e726df.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_ASCII_DIVIDERS, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SCC_10_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" This will make it easier to isolate changes in subsequent patches. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/lib1funcs.S (__ctzsi2): Moved to ... * config/arm/ctz2.S: New file. --- libgcc/config/arm/ctz2.S | 86 +++++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 65 +------------------------- 2 files changed, 87 insertions(+), 64 deletions(-) create mode 100644 libgcc/config/arm/ctz2.S diff --git a/libgcc/config/arm/ctz2.S b/libgcc/config/arm/ctz2.S new file mode 100644 index 00000000000..8702c9afb94 --- /dev/null +++ b/libgcc/config/arm/ctz2.S @@ -0,0 +1,86 @@ +/* Copyright (C) 1995-2021 Free Software Foundation, Inc. + +This file is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the +Free Software Foundation; either version 3, or (at your option) any +later version. + +This file is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +General Public License for more details. + +Under Section 7 of GPL version 3, you are granted additional +permissions described in the GCC Runtime Library Exception, version +3.1, as published by the Free Software Foundation. + +You should have received a copy of the GNU General Public License and +a copy of the GCC Runtime Library Exception along with this program; +see the files COPYING3 and COPYING.RUNTIME respectively. If not, see +. 
*/
+
+
+#ifdef L_ctzsi2
+#ifdef NOT_ISA_TARGET_32BIT
+FUNC_START ctzsi2
+        negs    r1, r0
+        ands    r0, r0, r1
+        movs    r1, #28
+        movs    r3, #1
+        lsls    r3, r3, #16
+        cmp     r0, r3 /* 0x10000 */
+        bcc     2f
+        lsrs    r0, r0, #16
+        subs    r1, r1, #16
+2:      lsrs    r3, r3, #8
+        cmp     r0, r3 /* #0x100 */
+        bcc     2f
+        lsrs    r0, r0, #8
+        subs    r1, r1, #8
+2:      lsrs    r3, r3, #4
+        cmp     r0, r3 /* #0x10 */
+        bcc     2f
+        lsrs    r0, r0, #4
+        subs    r1, r1, #4
+2:      adr     r2, 1f
+        ldrb    r0, [r2, r0]
+        subs    r0, r0, r1
+        bx      lr
+.align 2
+1:
+.byte 27, 28, 29, 29, 30, 30, 30, 30, 31, 31, 31, 31, 31, 31, 31, 31
+        FUNC_END ctzsi2
+#else
+ARM_FUNC_START ctzsi2
+        rsb     r1, r0, #0
+        and     r0, r0, r1
+# if defined (__ARM_FEATURE_CLZ)
+        clz     r0, r0
+        rsb     r0, r0, #31
+        RET
+# else
+        mov     r1, #28
+        cmp     r0, #0x10000
+        do_it   cs, t
+        movcs   r0, r0, lsr #16
+        subcs   r1, r1, #16
+        cmp     r0, #0x100
+        do_it   cs, t
+        movcs   r0, r0, lsr #8
+        subcs   r1, r1, #8
+        cmp     r0, #0x10
+        do_it   cs, t
+        movcs   r0, r0, lsr #4
+        subcs   r1, r1, #4
+        adr     r2, 1f
+        ldrb    r0, [r2, r0]
+        sub     r0, r0, r1
+        RET
+.align 2
+1:
+.byte 27, 28, 29, 29, 30, 30, 30, 30, 31, 31, 31, 31, 31, 31, 31, 31
+# endif /* !defined (__ARM_FEATURE_CLZ) */
+        FUNC_END ctzsi2
+#endif
+#endif /* L_clzsi2 */
+
diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
index d92f73ba0c9..b1df00ac597 100644
--- a/libgcc/config/arm/lib1funcs.S
+++ b/libgcc/config/arm/lib1funcs.S
@@ -1804,70 +1804,7 @@ LSYM(Lover12):
 #endif /* __symbian__ */
 
 #include "clz2.S"
-
-#ifdef L_ctzsi2
-#ifdef NOT_ISA_TARGET_32BIT
-FUNC_START ctzsi2
-        negs    r1, r0
-        ands    r0, r0, r1
-        movs    r1, #28
-        movs    r3, #1
-        lsls    r3, r3, #16
-        cmp     r0, r3 /* 0x10000 */
-        bcc     2f
-        lsrs    r0, r0, #16
-        subs    r1, r1, #16
-2:      lsrs    r3, r3, #8
-        cmp     r0, r3 /* #0x100 */
-        bcc     2f
-        lsrs    r0, r0, #8
-        subs    r1, r1, #8
-2:      lsrs    r3, r3, #4
-        cmp     r0, r3 /* #0x10 */
-        bcc     2f
-        lsrs    r0, r0, #4
-        subs    r1, r1, #4
-2:      adr     r2, 1f
-        ldrb    r0, [r2, r0]
-        subs    r0, r0, r1
-        bx      lr
-.align 2
-1:
-.byte 27, 28, 29, 29, 30, 30, 30, 30, 31, 31, 31, 31, 31, 31, 31, 31
-        FUNC_END ctzsi2
-#else
-ARM_FUNC_START ctzsi2
-        rsb     r1, r0, #0
-        and     r0, r0, r1
-# if defined (__ARM_FEATURE_CLZ)
-        clz     r0, r0
-        rsb     r0, r0, #31
-        RET
-# else
-        mov     r1, #28
-        cmp     r0, #0x10000
-        do_it   cs, t
-        movcs   r0, r0, lsr #16
-        subcs   r1, r1, #16
-        cmp     r0, #0x100
-        do_it   cs, t
-        movcs   r0, r0, lsr #8
-        subcs   r1, r1, #8
-        cmp     r0, #0x10
-        do_it   cs, t
-        movcs   r0, r0, lsr #4
-        subcs   r1, r1, #4
-        adr     r2, 1f
-        ldrb    r0, [r2, r0]
-        sub     r0, r0, r1
-        RET
-.align 2
-1:
-.byte 27, 28, 29, 29, 30, 30, 30, 30, 31, 31, 31, 31, 31, 31, 31, 31
-# endif /* !defined (__ARM_FEATURE_CLZ) */
-        FUNC_END ctzsi2
-#endif
-#endif /* L_clzsi2 */
+#include "ctz2.S"
 
 /* ------------------------------------------------------------------------ */
 /* These next two sections are here despite the fact that they contain Thumb

From patchwork Fri Jan 15 11:30:36 2021
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 1426903
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Cc: Richard.Earnshaw@foss.arm.com
Subject: [PATCH v5 08/33] Refactor 64-bit shift functions into a new file
Date: Fri, 15 Jan 2021 03:30:36 -0800

This will make it easier to isolate changes in subsequent patches.

gcc/libgcc/ChangeLog:
2021-01-13 Daniel Engel

        * config/arm/lib1funcs.S (__ashldi3, __ashrdi3, __lshrdi3): Moved to ...
        * config/arm/eabi/lshift.S: New file.
---
 libgcc/config/arm/eabi/lshift.S | 123 ++++++++++++++++++++++++++++++++
 libgcc/config/arm/lib1funcs.S   | 103 +-------------------------
 2 files changed, 124 insertions(+), 102 deletions(-)
 create mode 100644 libgcc/config/arm/eabi/lshift.S

diff --git a/libgcc/config/arm/eabi/lshift.S b/libgcc/config/arm/eabi/lshift.S
new file mode 100644
index 00000000000..0974a72c377
--- /dev/null
+++ b/libgcc/config/arm/eabi/lshift.S
@@ -0,0 +1,123 @@
+/* Copyright (C) 1995-2021 Free Software Foundation, Inc.
+
+This file is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+This file is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+<http://www.gnu.org/licenses/>. */
+
+
+#ifdef L_lshrdi3
+
+        FUNC_START lshrdi3
+        FUNC_ALIAS aeabi_llsr lshrdi3
+
+#ifdef __thumb__
+        lsrs    al, r2
+        movs    r3, ah
+        lsrs    ah, r2
+        mov     ip, r3
+        subs    r2, #32
+        lsrs    r3, r2
+        orrs    al, r3
+        negs    r2, r2
+        mov     r3, ip
+        lsls    r3, r2
+        orrs    al, r3
+        RET
+#else
+        subs    r3, r2, #32
+        rsb     ip, r2, #32
+        movmi   al, al, lsr r2
+        movpl   al, ah, lsr r3
+        orrmi   al, al, ah, lsl ip
+        mov     ah, ah, lsr r2
+        RET
+#endif
+        FUNC_END aeabi_llsr
+        FUNC_END lshrdi3
+
+#endif
+
+#ifdef L_ashrdi3
+
+        FUNC_START ashrdi3
+        FUNC_ALIAS aeabi_lasr ashrdi3
+
+#ifdef __thumb__
+        lsrs    al, r2
+        movs    r3, ah
+        asrs    ah, r2
+        subs    r2, #32
+        @ If r2 is negative at this point the following step would OR
+        @ the sign bit into all of AL. That's not what we want...
+        bmi     1f
+        mov     ip, r3
+        asrs    r3, r2
+        orrs    al, r3
+        mov     r3, ip
+1:
+        negs    r2, r2
+        lsls    r3, r2
+        orrs    al, r3
+        RET
+#else
+        subs    r3, r2, #32
+        rsb     ip, r2, #32
+        movmi   al, al, lsr r2
+        movpl   al, ah, asr r3
+        orrmi   al, al, ah, lsl ip
+        mov     ah, ah, asr r2
+        RET
+#endif
+
+        FUNC_END aeabi_lasr
+        FUNC_END ashrdi3
+
+#endif
+
+#ifdef L_ashldi3
+
+        FUNC_START ashldi3
+        FUNC_ALIAS aeabi_llsl ashldi3
+
+#ifdef __thumb__
+        lsls    ah, r2
+        movs    r3, al
+        lsls    al, r2
+        mov     ip, r3
+        subs    r2, #32
+        lsls    r3, r2
+        orrs    ah, r3
+        negs    r2, r2
+        mov     r3, ip
+        lsrs    r3, r2
+        orrs    ah, r3
+        RET
+#else
+        subs    r3, r2, #32
+        rsb     ip, r2, #32
+        movmi   ah, ah, lsl r2
+        movpl   ah, al, lsl r3
+        orrmi   ah, ah, al, lsr ip
+        mov     al, al, lsl r2
+        RET
+#endif
+        FUNC_END aeabi_llsl
+        FUNC_END ashldi3
+
+#endif
+
diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
index b1df00ac597..7ac50230725 100644
--- a/libgcc/config/arm/lib1funcs.S
+++ b/libgcc/config/arm/lib1funcs.S
@@ -1699,108 +1699,7 @@ LSYM(Lover12):
 
 /* Prevent __aeabi double-word shifts from being produced on SymbianOS. */
 #ifndef __symbian__
-
-#ifdef L_lshrdi3
-
-        FUNC_START lshrdi3
-        FUNC_ALIAS aeabi_llsr lshrdi3
-
-#ifdef __thumb__
-        lsrs    al, r2
-        movs    r3, ah
-        lsrs    ah, r2
-        mov     ip, r3
-        subs    r2, #32
-        lsrs    r3, r2
-        orrs    al, r3
-        negs    r2, r2
-        mov     r3, ip
-        lsls    r3, r2
-        orrs    al, r3
-        RET
-#else
-        subs    r3, r2, #32
-        rsb     ip, r2, #32
-        movmi   al, al, lsr r2
-        movpl   al, ah, lsr r3
-        orrmi   al, al, ah, lsl ip
-        mov     ah, ah, lsr r2
-        RET
-#endif
-        FUNC_END aeabi_llsr
-        FUNC_END lshrdi3
-
-#endif
-
-#ifdef L_ashrdi3
-
-        FUNC_START ashrdi3
-        FUNC_ALIAS aeabi_lasr ashrdi3
-
-#ifdef __thumb__
-        lsrs    al, r2
-        movs    r3, ah
-        asrs    ah, r2
-        subs    r2, #32
-        @ If r2 is negative at this point the following step would OR
-        @ the sign bit into all of AL. That's not what we want...
-        bmi     1f
-        mov     ip, r3
-        asrs    r3, r2
-        orrs    al, r3
-        mov     r3, ip
-1:
-        negs    r2, r2
-        lsls    r3, r2
-        orrs    al, r3
-        RET
-#else
-        subs    r3, r2, #32
-        rsb     ip, r2, #32
-        movmi   al, al, lsr r2
-        movpl   al, ah, asr r3
-        orrmi   al, al, ah, lsl ip
-        mov     ah, ah, asr r2
-        RET
-#endif
-
-        FUNC_END aeabi_lasr
-        FUNC_END ashrdi3
-
-#endif
-
-#ifdef L_ashldi3
-
-        FUNC_START ashldi3
-        FUNC_ALIAS aeabi_llsl ashldi3
-
-#ifdef __thumb__
-        lsls    ah, r2
-        movs    r3, al
-        lsls    al, r2
-        mov     ip, r3
-        subs    r2, #32
-        lsls    r3, r2
-        orrs    ah, r3
-        negs    r2, r2
-        mov     r3, ip
-        lsrs    r3, r2
-        orrs    ah, r3
-        RET
-#else
-        subs    r3, r2, #32
-        rsb     ip, r2, #32
-        movmi   ah, ah, lsl r2
-        movpl   ah, al, lsl r3
-        orrmi   ah, ah, al, lsr ip
-        mov     al, al, lsl r2
-        RET
-#endif
-        FUNC_END aeabi_llsl
-        FUNC_END ashldi3
-
-#endif
-
+#include "eabi/lshift.S"
 #endif /* __symbian__ */
 
 #include "clz2.S"

From patchwork Fri Jan 15 11:30:37 2021
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 1426904
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Cc: Richard.Earnshaw@foss.arm.com
Subject: [PATCH v5 09/33] Import 'clz' functions from the CM0 library
Date: Fri, 15 Jan 2021 03:30:37 -0800
Message-Id: <66d4e6af64a7066206b5d15daed58c5878358416.1610709584.git.gnu@danielengel.com>

On architectures without __ARM_FEATURE_CLZ, this version combines
__clzdi2() with __clzsi2() into a single object with an efficient
tail call. Also, this version merges the formerly separate Thumb and
ARM code implementations into a unified instruction sequence. This
change significantly improves Thumb performance without affecting ARM
performance. Finally, this version adds a new __OPTIMIZE_SIZE__ build
option (binary search loop).

There is no change to the code for architectures with __ARM_FEATURE_CLZ.

gcc/libgcc/ChangeLog:
2021-01-13 Daniel Engel

        * config/arm/bits/clz2.S (__clzsi2, __clzdi2): Reduced code size on
        architectures without __ARM_FEATURE_CLZ.
        * config/arm/t-elf (LIB1ASMFUNCS): Moved _clzsi2 to new weak group.
---
 libgcc/config/arm/clz2.S | 362 +++++++++++++++++++++++++--------------
 libgcc/config/arm/t-elf  |   7 +-
 2 files changed, 236 insertions(+), 133 deletions(-)

diff --git a/libgcc/config/arm/clz2.S b/libgcc/config/arm/clz2.S
index 2ad9a81892c..dc246708a82 100644
--- a/libgcc/config/arm/clz2.S
+++ b/libgcc/config/arm/clz2.S
@@ -1,145 +1,243 @@
-/* Copyright (C) 1995-2021 Free Software Foundation, Inc.
+/* clz2.S: Cortex M0 optimized 'clz' functions
 
-This file is free software; you can redistribute it and/or modify it
-under the terms of the GNU General Public License as published by the
-Free Software Foundation; either version 3, or (at your option) any
-later version.
+    Copyright (C) 2018-2021 Free Software Foundation, Inc.
+    Contributed by Daniel Engel (gnu@danielengel.com)
 
-This file is distributed in the hope that it will be useful, but
-WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-General Public License for more details.
+    This file is free software; you can redistribute it and/or modify it
+    under the terms of the GNU General Public License as published by the
+    Free Software Foundation; either version 3, or (at your option) any
+    later version.
 
-Under Section 7 of GPL version 3, you are granted additional
-permissions described in the GCC Runtime Library Exception, version
-3.1, as published by the Free Software Foundation.
+    This file is distributed in the hope that it will be useful, but
+    WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+    General Public License for more details.
 
-You should have received a copy of the GNU General Public License and
-a copy of the GCC Runtime Library Exception along with this program;
-see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
-<http://www.gnu.org/licenses/>. */
+    Under Section 7 of GPL version 3, you are granted additional
+    permissions described in the GCC Runtime Library Exception, version
+    3.1, as published by the Free Software Foundation.
+
+    You should have received a copy of the GNU General Public License and
+    a copy of the GCC Runtime Library Exception along with this program;
+    see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+    <http://www.gnu.org/licenses/>. */
+
+
+#if defined(__ARM_FEATURE_CLZ) && __ARM_FEATURE_CLZ
+
+#ifdef L_clzdi2
+
+// int __clzdi2(long long)
+// Counts leading zero bits in $r1:$r0.
+// Returns the result in $r0.
+FUNC_START_SECTION clzdi2 .text.sorted.libgcc.clz2.clzdi2
+    CFI_START_FUNCTION
+
+        // Moved here from lib1funcs.S
+        cmp     xxh, #0
+        do_it   eq, et
+        clzeq   r0, xxl
+        clzne   r0, xxh
+        addeq   r0, #32
+        RET
+
+    CFI_END_FUNCTION
+FUNC_END clzdi2
+
+#endif /* L_clzdi2 */
 
 #ifdef L_clzsi2
-#ifdef NOT_ISA_TARGET_32BIT
-FUNC_START clzsi2
-        movs    r1, #28
-        movs    r3, #1
-        lsls    r3, r3, #16
-        cmp     r0, r3 /* 0x10000 */
-        bcc     2f
-        lsrs    r0, r0, #16
-        subs    r1, r1, #16
-2:      lsrs    r3, r3, #8
-        cmp     r0, r3 /* #0x100 */
-        bcc     2f
-        lsrs    r0, r0, #8
-        subs    r1, r1, #8
-2:      lsrs    r3, r3, #4
-        cmp     r0, r3 /* #0x10 */
-        bcc     2f
-        lsrs    r0, r0, #4
-        subs    r1, r1, #4
-2:      adr     r2, 1f
-        ldrb    r0, [r2, r0]
-        adds    r0, r0, r1
-        bx      lr
-.align 2
-1:
-.byte 4, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0
-        FUNC_END clzsi2
-#else
-ARM_FUNC_START clzsi2
-# if defined (__ARM_FEATURE_CLZ)
-        clz     r0, r0
-        RET
-# else
-        mov     r1, #28
-        cmp     r0, #0x10000
-        do_it   cs, t
-        movcs   r0, r0, lsr #16
-        subcs   r1, r1, #16
-        cmp     r0, #0x100
-        do_it   cs, t
-        movcs   r0, r0, lsr #8
-        subcs   r1, r1, #8
-        cmp     r0, #0x10
-        do_it   cs, t
-        movcs   r0, r0, lsr #4
-        subcs   r1, r1, #4
-        adr     r2, 1f
-        ldrb    r0, [r2, r0]
-        add     r0, r0, r1
-        RET
-.align 2
-1:
-.byte 4, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0
-# endif /* !defined (__ARM_FEATURE_CLZ) */
-        FUNC_END clzsi2
-#endif
+
+// int __clzsi2(int)
+// Counts leading zero bits in $r0.
+// Returns the result in $r0.
+FUNC_START_SECTION clzsi2 .text.sorted.libgcc.clz2.clzsi2
+    CFI_START_FUNCTION
+
+        // Moved here from lib1funcs.S
+        clz     r0, r0
+        RET
+
+    CFI_END_FUNCTION
+FUNC_END clzsi2
+
 #endif /* L_clzsi2 */
 
+#else /* !__ARM_FEATURE_CLZ */
+
 #ifdef L_clzdi2
-#if !defined (__ARM_FEATURE_CLZ)
-
-# ifdef NOT_ISA_TARGET_32BIT
-FUNC_START clzdi2
-        push    {r4, lr}
-        cmp     xxh, #0
-        bne     1f
-# ifdef __ARMEB__
-        movs    r0, xxl
-        bl      __clzsi2
-        adds    r0, r0, #32
-        b       2f
-1:
-        bl      __clzsi2
-# else
-        bl      __clzsi2
-        adds    r0, r0, #32
-        b       2f
-1:
-        movs    r0, xxh
-        bl      __clzsi2
-# endif
-2:
-        pop     {r4, pc}
-# else /* NOT_ISA_TARGET_32BIT */
-ARM_FUNC_START clzdi2
-        do_push {r4, lr}
-        cmp     xxh, #0
-        bne     1f
-# ifdef __ARMEB__
-        mov     r0, xxl
-        bl      __clzsi2
-        add     r0, r0, #32
-        b       2f
-1:
-        bl      __clzsi2
-# else
-        bl      __clzsi2
-        add     r0, r0, #32
-        b       2f
-1:
-        mov     r0, xxh
-        bl      __clzsi2
-# endif
-2:
-        RETLDM  r4
-        FUNC_END clzdi2
-# endif /* NOT_ISA_TARGET_32BIT */
-
-#else /* defined (__ARM_FEATURE_CLZ) */
-
-ARM_FUNC_START clzdi2
-        cmp     xxh, #0
-        do_it   eq, et
-        clzeq   r0, xxl
-        clzne   r0, xxh
-        addeq   r0, r0, #32
-        RET
-        FUNC_END clzdi2
-#endif
+// int __clzdi2(long long)
+// Counts leading zero bits in $r1:$r0.
+// Returns the result in $r0.
+// Uses $r2 and possibly $r3 as scratch space.
+FUNC_START_SECTION clzdi2 .text.sorted.libgcc.clz2.clzdi2
+    CFI_START_FUNCTION
+
+    #if defined(__ARMEB__) && __ARMEB__
+        // Check if the upper word is zero.
+        cmp     r0, #0
+
+        // The upper word is non-zero, so calculate __clzsi2(upper).
+        bne     SYM(__clzsi2)
+
+        // The upper word is zero, so calculate 32 + __clzsi2(lower).
+        movs    r2, #64
+        movs    r0, r1
+        b       SYM(__internal_clzsi2)
+
+    #else /* !__ARMEB__ */
+        // Assume all the bits in the argument are zero.
+        movs    r2, #64
+
+        // Check if the upper word is zero.
+        cmp     r1, #0
+
+        // The upper word is zero, so calculate 32 + __clzsi2(lower).
+        beq     SYM(__internal_clzsi2)
+
+        // The upper word is non-zero, so set up __clzsi2(upper).
+        // Then fall through.
+        movs    r0, r1
+
+    #endif /* !__ARMEB__ */
+
 #endif /* L_clzdi2 */
+
+// The bitwise implementation of __clzdi2() tightly couples with __clzsi2(),
+// such that instructions must appear consecutively in the same memory
+// section for proper flow control. However, this construction inhibits
+// the ability to discard __clzdi2() when only using __clzsi2().
+// Therefore, this block configures __clzsi2() for compilation twice.
+// The first version is a minimal standalone implementation, and the second
+// version is the continuation of __clzdi2(). The standalone version must
+// be declared WEAK, so that the combined version can supersede it and
+// provide both symbols when required.
+// '_clzsi2' should appear before '_clzdi2' in LIB1ASMFUNCS.
+#if defined(L_clzsi2) || defined(L_clzdi2)
+
+#ifdef L_clzsi2
+// int __clzsi2(int)
+// Counts leading zero bits in $r0.
+// Returns the result in $r0.
+// Uses $r2 and possibly $r3 as scratch space.
+WEAK_START_SECTION clzsi2 .text.sorted.libgcc.clz2.clzsi2
+    CFI_START_FUNCTION
+
+#else /* L_clzdi2 */
+FUNC_ENTRY clzsi2
+
+#endif
+
+        // Assume all the bits in the argument are zero
+        movs    r2, #32
+
+#ifdef L_clzsi2
+    WEAK_ENTRY internal_clzsi2
+#else /* L_clzdi2 */
+    FUNC_ENTRY internal_clzsi2
+#endif
+
+        // Size optimized: 22 bytes, 51 cycles
+        // Speed optimized: 50 bytes, 20 cycles
+
+    #if defined(__OPTIMIZE_SIZE__) && __OPTIMIZE_SIZE__
+
+        // Binary search starts at half the word width.
+        movs    r3, #16
+
+    LLSYM(__clz_loop):
+        // Test the upper 'n' bits of the operand for ZERO.
+        movs    r1, r0
+        lsrs    r1, r3
+
+        // When the test fails, discard the lower bits of the register,
+        // and deduct the count of discarded bits from the result.
+      #ifdef __HAVE_FEATURE_IT
+        do_it   ne, t
+      #else
+        beq     LLSYM(__clz_skip)
+      #endif
+
+     IT(mov,ne) r0, r1
+     IT(sub,ne) r2, r3
+
+    LLSYM(__clz_skip):
+        // Decrease the shift distance for the next test.
+        lsrs    r3, #1
+        bne     LLSYM(__clz_loop)
+
+    #else /* __OPTIMIZE_SIZE__ */
+
+        // Unrolled binary search.
+        lsrs    r1, r0, #16
+
+      #ifdef __HAVE_FEATURE_IT
+        do_it   ne,t
+      #else
+        beq     LLSYM(__clz8)
+      #endif
+
+        // Out of 32 bits, the first '1' is somewhere in the highest 16,
+        // so the lower 16 bits are no longer interesting.
+     IT(mov,ne) r0, r1
+     IT(sub,ne) r2, #16
+
+    LLSYM(__clz8):
+        lsrs    r1, r0, #8
+
+        // Out of 16 bits, the first '1' is somewhere in the highest 8,
+        // so the lower 8 bits are no longer interesting.
+      #ifdef __HAVE_FEATURE_IT
+        do_it   ne,t
+      #else
+        beq     LLSYM(__clz4)
+      #endif
+
+        // Out of 8 bits, the first '1' is somewhere in the highest 4,
+        // so the lower 4 bits are no longer interesting.
+     IT(mov,ne) r0, r1
+     IT(sub,ne) r2, #8
+
+    LLSYM(__clz4):
+        lsrs    r1, r0, #4
+
+      #ifdef __HAVE_FEATURE_IT
+        do_it   ne,t
+      #else
+        beq     LLSYM(__clz2)
+      #endif
+
+     IT(mov,ne) r0, r1
+     IT(sub,ne) r2, #4
+
+    LLSYM(__clz2):
+        // Load the remainder by index
+        adr     r1, LLSYM(__clz_remainder)
+        ldrb    r0, [r1, r0]
+
+    #endif /* !__OPTIMIZE_SIZE__ */
+
+        // Account for the remainder.
+        subs    r0, r2, r0
+        RET
+
+    #if !defined(__OPTIMIZE_SIZE__) || !__OPTIMIZE_SIZE__
+        .align 2
+    LLSYM(__clz_remainder):
+        .byte 0,1,2,2,3,3,3,3,4,4,4,4,4,4,4,4
+    #endif
+
+    CFI_END_FUNCTION
+FUNC_END clzsi2
+
+#ifdef L_clzdi2
+FUNC_END clzdi2
+#endif
+
+#endif /* L_clzsi2 || L_clzdi2 */
+
+#endif /* !__ARM_FEATURE_CLZ */
+
diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf
index 93ea1cd8f76..af779afa0a9 100644
--- a/libgcc/config/arm/t-elf
+++ b/libgcc/config/arm/t-elf
@@ -19,13 +19,18 @@ endif # !__symbian__
 # assembly implementation here for double-precision values.
 
 
+# Group 0: WEAK overridable function objects.
+# See respective sources for rationale.
+LIB1ASMFUNCS += \
+        _clzsi2 \
+
+
 # Group 1: Integer function objects.
 LIB1ASMFUNCS += \
         _ashldi3 \
         _ashrdi3 \
         _lshrdi3 \
         _clzdi2 \
-        _clzsi2 \
         _ctzsi2 \
         _dvmd_tls \
         _divsi3 \

From patchwork Fri Jan 15 11:30:38 2021
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 1426905
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Cc: Richard.Earnshaw@foss.arm.com
Subject: [PATCH v5 10/33] Import 'ctz' functions from the CM0 library
Date: Fri, 15 Jan 2021 03:30:38 -0800
Message-Id: <1f4c853d9f0d313919f2e2c85f1f9c1bce795b0a.1610709584.git.gnu@danielengel.com>

This version combines __ctzdi2() with __ctzsi2() into a single object with
an efficient tail call. The former implementation of __ctzdi2() was in C.

On architectures without __ARM_FEATURE_CLZ, this version merges the formerly separate Thumb and ARM code sequences into a unified instruction sequence. This change significantly improves Thumb performance without affecting ARM performance. Finally, this version adds a new __OPTIMIZE_SIZE__ build option.

On architectures with __ARM_FEATURE_CLZ, __ctzsi2(0) now returns 32. Formerly, __ctzsi2(0) would return -1. Architectures without __ARM_FEATURE_CLZ have always returned 32, so this change makes the return value consistent. This change costs 2 extra instructions (branchless). Likewise on architectures with __ARM_FEATURE_CLZ, __ctzdi2(0) now returns 64 instead of 31.

gcc/libgcc/ChangeLog:
2021-01-13 Daniel Engel

	* config/arm/bits/ctz2.S (__ctzdi2): Added a new function.
	(__ctzsi2): Reduced size on architectures without __ARM_FEATURE_CLZ; changed so __ctzsi2(0)=32 on architectures with __ARM_FEATURE_CLZ.
	* config/arm/t-elf (LIB1ASMFUNCS): Added _ctzdi2; moved _ctzsi2 to the weak function objects group.
---
 libgcc/config/arm/ctz2.S | 307 +++++++++++++++++++++++++++++----------
 libgcc/config/arm/t-elf | 3 +-
 2 files changed, 232 insertions(+), 78 deletions(-)

diff --git a/libgcc/config/arm/ctz2.S b/libgcc/config/arm/ctz2.S
index 8702c9afb94..ee6df6d6d01 100644
--- a/libgcc/config/arm/ctz2.S
+++ b/libgcc/config/arm/ctz2.S
@@ -1,86 +1,239 @@ -/* Copyright (C) 1995-2021 Free Software Foundation, Inc. +/* ctz2.S: ARM optimized 'ctz' functions -This file is free software; you can redistribute it and/or modify it -under the terms of the GNU General Public License as published by the -Free Software Foundation; either version 3, or (at your option) any -later version. + Copyright (C) 2020-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel (gnu@danielengel.com) -This file is distributed in the hope that it will be useful, but -WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU -General Public License for more details. + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -. */ + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . 
. */ -#ifdef L_ctzsi2 -#ifdef NOT_ISA_TARGET_32BIT -FUNC_START ctzsi2 - negs r1, r0 - ands r0, r0, r1 - movs r1, #28 - movs r3, #1 - lsls r3, r3, #16 - cmp r0, r3 /* 0x10000 */ - bcc 2f - lsrs r0, r0, #16 - subs r1, r1, #16 -2: lsrs r3, r3, #8 - cmp r0, r3 /* #0x100 */ - bcc 2f - lsrs r0, r0, #8 - subs r1, r1, #8 -2: lsrs r3, r3, #4 - cmp r0, r3 /* #0x10 */ - bcc 2f - lsrs r0, r0, #4 - subs r1, r1, #4 -2: adr r2, 1f - ldrb r0, [r2, r0] - subs r0, r0, r1 - bx lr -.align 2 -1: -.byte 27, 28, 29, 29, 30, 30, 30, 30, 31, 31, 31, 31, 31, 31, 31, 31 - FUNC_END ctzsi2 + +// When the hardware 'clz' function is available, an efficient version +// of __ctzsi2(x) can be created by calculating '31 - __clzsi2(lsb(x))', +// where lsb(x) is 'x' with only the least-significant '1' bit set. +// The following offset applies to all of the functions in this file. +#if defined(__ARM_FEATURE_CLZ) && __ARM_FEATURE_CLZ + #define CTZ_RESULT_OFFSET 1 #else -ARM_FUNC_START ctzsi2 - rsb r1, r0, #0 - and r0, r0, r1 -# if defined (__ARM_FEATURE_CLZ) - clz r0, r0 - rsb r0, r0, #31 - RET -# else - mov r1, #28 - cmp r0, #0x10000 - do_it cs, t - movcs r0, r0, lsr #16 - subcs r1, r1, #16 - cmp r0, #0x100 - do_it cs, t - movcs r0, r0, lsr #8 - subcs r1, r1, #8 - cmp r0, #0x10 - do_it cs, t - movcs r0, r0, lsr #4 - subcs r1, r1, #4 - adr r2, 1f - ldrb r0, [r2, r0] - sub r0, r0, r1 - RET -.align 2 -1: -.byte 27, 28, 29, 29, 30, 30, 30, 30, 31, 31, 31, 31, 31, 31, 31, 31 -# endif /* !defined (__ARM_FEATURE_CLZ) */ - FUNC_END ctzsi2 + #define CTZ_RESULT_OFFSET 0 +#endif + + +#ifdef L_ctzdi2 + +// int __ctzdi2(long long) +// Counts trailing zeros in a 64 bit double word. +// Expects the argument in $r1:$r0. +// Returns the result in $r0. +// Uses $r2 and possibly $r3 as scratch space. +FUNC_START_SECTION ctzdi2 .text.sorted.libgcc.ctz2.ctzdi2 + CFI_START_FUNCTION + + #if defined(__ARMEB__) && __ARMEB__ + // Assume all the bits in the argument are zero.
+ movs r2, #(64 - CTZ_RESULT_OFFSET) + + // Check if the lower word is zero. + cmp r1, #0 + + // The lower word is zero, so calculate 32 + __ctzsi2(upper). + beq SYM(__internal_ctzsi2) + + // The lower word is non-zero, so set up __ctzsi2(lower). + // Then fall through. + movs r0, r1 + + #else /* !__ARMEB__ */ + // Check if the lower word is zero. + cmp r0, #0 + + // If the lower word is non-zero, result is just __ctzsi2(lower). + bne SYM(__ctzsi2) + + // The lower word is zero, so calculate 32 + __ctzsi2(upper). + movs r2, #(64 - CTZ_RESULT_OFFSET) + movs r0, r1 + b SYM(__internal_ctzsi2) + + #endif /* !__ARMEB__ */ + +#endif /* L_ctzdi2 */ + + +// The bitwise implementation of __ctzdi2() tightly couples with __ctzsi2(), +// such that instructions must appear consecutively in the same memory +// section for proper flow control. However, this construction inhibits +// the ability to discard __ctzdi2() when only using __ctzsi2(). +// Therefore, this block configures __ctzsi2() for compilation twice. +// The first version is a minimal standalone implementation, and the second +// version is the continuation of __ctzdi2(). The standalone version must +// be declared WEAK, so that the combined version can supersede it and +// provide both symbols when required. +// '_ctzsi2' should appear before '_ctzdi2' in LIB1ASMFUNCS. +#if defined(L_ctzsi2) || defined(L_ctzdi2) + +#ifdef L_ctzsi2 +// int __ctzsi2(int) +// Counts trailing zeros in a 32 bit word. +// Expects the argument in $r0. +// Returns the result in $r0. +// Uses $r2 and possibly $r3 as scratch space. 
+WEAK_START_SECTION ctzsi2 .text.sorted.libgcc.ctz2.ctzdi2 + CFI_START_FUNCTION + +#else /* L_ctzdi2 */ +FUNC_ENTRY ctzsi2 + #endif -#endif /* L_clzsi2 */ + + // Assume all the bits in the argument are zero + movs r2, #(32 - CTZ_RESULT_OFFSET) + +#ifdef L_ctzsi2 + WEAK_ENTRY internal_ctzsi2 +#else /* L_ctzdi2 */ + FUNC_ENTRY internal_ctzsi2 +#endif + + #if defined(__ARM_FEATURE_CLZ) && __ARM_FEATURE_CLZ + + // Find the least-significant '1' bit of the argument. + rsbs r1, r0, #0 + ands r1, r0 + + // Maintain result compatibility with the software implementation. + // Technically, __ctzsi2(0) is undefined, but 32 seems better than -1. + // (or possibly 31 if this is an intermediate result for __ctzdi2(0)). + // The carry flag from 'rsbs' gives '-1' iff the argument was 'zero'. + // (NOTE: 'ands' with 0 shift bits does not change the carry flag.) + // After the jump, the final result will be '31 - (-1)'. + sbcs r0, r0 + + #ifdef __HAVE_FEATURE_IT + do_it ne + #else + beq LLSYM(__ctz_zero) + #endif + + // Gives the number of '0' bits left of the least-significant '1'. + IT(clz,ne) r0, r1 + + #elif defined(__OPTIMIZE_SIZE__) && __OPTIMIZE_SIZE__ + // Size optimized: 24 bytes, 52 cycles + // Speed optimized: 52 bytes, 21 cycles + + // Binary search starts at half the word width. + movs r3, #16 + + LLSYM(__ctz_loop): + // Test the upper 'n' bits of the operand for ZERO. + movs r1, r0 + lsls r1, r3 + + // When the test fails, discard the lower bits of the register, + // and deduct the count of discarded bits from the result. + #ifdef __HAVE_FEATURE_IT + do_it ne, t + #else + beq LLSYM(__ctz_skip) + #endif + + IT(mov,ne) r0, r1 + IT(sub,ne) r2, r3 + + LLSYM(__ctz_skip): + // Decrease the shift distance for the next test. + lsrs r3, #1 + bne LLSYM(__ctz_loop) + + // Prepare the remainder. + lsrs r0, #31 + + #else /* !__OPTIMIZE_SIZE__ */ + + // Unrolled binary search. 
+ lsls r1, r0, #16 + + #ifdef __HAVE_FEATURE_IT + do_it ne, t + #else + beq LLSYM(__ctz8) + #endif + + // Out of 32 bits, the first '1' is somewhere in the lowest 16, + // so the higher 16 bits are no longer interesting. + IT(mov,ne) r0, r1 + IT(sub,ne) r2, #16 + + LLSYM(__ctz8): + lsls r1, r0, #8 + + #ifdef __HAVE_FEATURE_IT + do_it ne, t + #else + beq LLSYM(__ctz4) + #endif + + // Out of 16 bits, the first '1' is somewhere in the lowest 8, + // so the higher 8 bits are no longer interesting. + IT(mov,ne) r0, r1 + IT(sub,ne) r2, #8 + + LLSYM(__ctz4): + lsls r1, r0, #4 + + #ifdef __HAVE_FEATURE_IT + do_it ne, t + #else + beq LLSYM(__ctz2) + #endif + + // Out of 8 bits, the first '1' is somewhere in the lowest 4, + // so the higher 4 bits are no longer interesting. + IT(mov,ne) r0, r1 + IT(sub,ne) r2, #4 + + LLSYM(__ctz2): + // Look up the remainder by index. + lsrs r0, #28 + adr r3, LLSYM(__ctz_remainder) + ldrb r0, [r3, r0] + + #endif /* !__OPTIMIZE_SIZE__ */ + + LLSYM(__ctz_zero): + // Apply the remainder. + subs r0, r2, r0 + RET + + #if (!defined(__ARM_FEATURE_CLZ) || !__ARM_FEATURE_CLZ) && \ + (!defined(__OPTIMIZE_SIZE__) || !__OPTIMIZE_SIZE__) + .align 2 + LLSYM(__ctz_remainder): + .byte 0,4,3,4,2,4,3,4,1,4,3,4,2,4,3,4 + #endif + + CFI_END_FUNCTION +FUNC_END ctzsi2 + +#ifdef L_ctzdi2 +FUNC_END ctzdi2 +#endif + +#endif /* L_ctzsi2 || L_ctzdi2 */ diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index af779afa0a9..33b83ac4adf 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -23,6 +23,7 @@ endif # !__symbian__ # See respective sources for rationale. LIB1ASMFUNCS += \ _clzsi2 \ + _ctzsi2 \ # Group 1: Integer function objects. 
@@ -31,7 +32,7 @@ LIB1ASMFUNCS += \ _ashrdi3 \ _lshrdi3 \ _clzdi2 \ - _ctzsi2 \ + _ctzdi2 \ _dvmd_tls \ _divsi3 \ _modsi3 \

From patchwork Fri Jan 15 11:30:39 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 1426906
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v5 11/33] Import 64-bit shift functions from the CM0 library
Date: Fri, 15 Jan 2021 03:30:39 -0800
Message-Id: <8d025ab5ba947e552a204e7df511cd2dab73c880.1610709584.git.gnu@danielengel.com>
X-Mailer: git-send-email 2.25.1
Cc: Richard.Earnshaw@foss.arm.com

The Thumb versions of these functions are each 1-2 instructions smaller and faster, and branchless when the IT instruction is available.
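
The Thumb sequence for the logical right shift can be read as the following C model. This is a sketch only: `model_llsr`, `reg_lsl`, `reg_lsr` and `model_llsr_selfcheck` are hypothetical names, and the two helpers encode the assumption that a register-controlled LSL or LSR by 32 or more yields 0, which the assembly relies on and which C's own shift operators do not guarantee. Counts are limited to 0..63, matching the range the source comments promise.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a register-controlled logical shift right: counts of 32 or
   more produce 0 (a plain C shift by >= 32 would be undefined). */
static uint32_t reg_lsr(uint32_t x, unsigned n)
{
    return n < 32 ? x >> n : 0;
}

/* Model of a register-controlled logical shift left, same convention. */
static uint32_t reg_lsl(uint32_t x, unsigned n)
{
    return n < 32 ? x << n : 0;
}

/* Logical shift right of a 64-bit value held as two 32-bit words. */
uint64_t model_llsr(uint64_t value, unsigned count)
{
    uint32_t lo = (uint32_t)value;
    uint32_t hi = (uint32_t)(value >> 32);

    /* Save a copy of the high word for the remainder. */
    uint32_t rem = hi;

    /* Assume a simple shift; both words become 0 when count >= 32. */
    lo = reg_lsr(lo, count);
    hi = reg_lsr(hi, count);

    if (count < 32) {
        /* The remainder is opposite the main shift, (32 - count) bits.
           For count == 0 this shifts by 32 and contributes nothing. */
        rem = reg_lsl(rem, 32 - count);
    } else {
        /* The shift extends past one word: the high word lands in the
           low word, shifted by the excess. */
        rem = reg_lsr(rem, count - 32);
    }

    /* Merge remainder and result; the bit ranges never overlap, so an
       add behaves like an OR here. */
    lo += rem;

    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical usage check, not part of the patch. */
void model_llsr_selfcheck(void)
{
    const uint64_t v = 0x123456789ABCDEF0ULL;

    for (unsigned n = 0; n < 64; n++)
        assert(model_llsr(v, n) == v >> n);
}
```

The arithmetic and left-shift variants in the diff follow the same shape, with the shift directions and the word receiving the remainder changed accordingly.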

The ARM versions were converted to the "xxl/xxh" big-endian register naming convention, but are otherwise unchanged.

gcc/libgcc/ChangeLog:
2021-01-13 Daniel Engel

	* config/arm/bits/shift.S (__ashldi3, __ashrdi3, __lshrdi3): Reduced code size on Thumb architectures; updated big-endian register naming convention to "xxl/xxh".
---
 libgcc/config/arm/eabi/lshift.S | 338 +++++++++++++++++++++-----------
 1 file changed, 228 insertions(+), 110 deletions(-)

diff --git a/libgcc/config/arm/eabi/lshift.S b/libgcc/config/arm/eabi/lshift.S
index 0974a72c377..16cf2dcef04 100644
--- a/libgcc/config/arm/eabi/lshift.S
+++ b/libgcc/config/arm/eabi/lshift.S
@@ -1,123 +1,241 @@ -/* Copyright (C) 1995-2021 Free Software Foundation, Inc. +/* lshift.S: ARM optimized 64-bit integer shift -This file is free software; you can redistribute it and/or modify it -under the terms of the GNU General Public License as published by the -Free Software Foundation; either version 3, or (at your option) any -later version. + Copyright (C) 2018-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) -This file is distributed in the hope that it will be useful, but -WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -General Public License for more details. + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU + General Public License for more details. -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -. */ + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ #ifdef L_lshrdi3 - FUNC_START lshrdi3 - FUNC_ALIAS aeabi_llsr lshrdi3 - -#ifdef __thumb__ - lsrs al, r2 - movs r3, ah - lsrs ah, r2 - mov ip, r3 - subs r2, #32 - lsrs r3, r2 - orrs al, r3 - negs r2, r2 - mov r3, ip - lsls r3, r2 - orrs al, r3 - RET -#else - subs r3, r2, #32 - rsb ip, r2, #32 - movmi al, al, lsr r2 - movpl al, ah, lsr r3 - orrmi al, al, ah, lsl ip - mov ah, ah, lsr r2 - RET -#endif - FUNC_END aeabi_llsr - FUNC_END lshrdi3 - -#endif - +// long long __aeabi_llsr(long long, int) +// Logical shift right the 64 bit value in $r1:$r0 by the count in $r2. +// The result is only guaranteed for shifts in the range of '0' to '63'. +// Uses $r3 as scratch space. +FUNC_START_SECTION aeabi_llsr .text.sorted.libgcc.lshrdi3 +FUNC_ALIAS lshrdi3 aeabi_llsr + CFI_START_FUNCTION + + #if defined(__thumb__) && __thumb__ + + // Save a copy for the remainder. + movs r3, xxh + + // Assume a simple shift. + lsrs xxl, r2 + lsrs xxh, r2 + + // Test if the shift distance is larger than 1 word. + subs r2, #32 + + #ifdef __HAVE_FEATURE_IT + do_it lo,te + + // The remainder is opposite the main shift, (32 - x) bits. + rsblo r2, #0 + lsllo r3, r2 + + // The remainder shift extends into the hi word. 
+ lsrhs r3, r2 + + #else /* !__HAVE_FEATURE_IT */ + bhs LLSYM(__llsr_large) + + // The remainder is opposite the main shift, (32 - x) bits. + rsbs r2, #0 + lsls r3, r2 + + // Cancel any remaining shift. + eors r2, r2 + + LLSYM(__llsr_large): + // Apply any remaining shift to the hi word. + lsrs r3, r2 + + #endif /* !__HAVE_FEATURE_IT */ + + // Merge remainder and result. + adds xxl, r3 + RET + + #else /* !__thumb__ */ + + subs r3, r2, #32 + rsb ip, r2, #32 + movmi xxl, xxl, lsr r2 + movpl xxl, xxh, lsr r3 + orrmi xxl, xxl, xxh, lsl ip + mov xxh, xxh, lsr r2 + RET + + #endif /* !__thumb__ */ + + + CFI_END_FUNCTION +FUNC_END lshrdi3 +FUNC_END aeabi_llsr + +#endif /* L_lshrdi3 */ + + #ifdef L_ashrdi3 - - FUNC_START ashrdi3 - FUNC_ALIAS aeabi_lasr ashrdi3 - -#ifdef __thumb__ - lsrs al, r2 - movs r3, ah - asrs ah, r2 - subs r2, #32 - @ If r2 is negative at this point the following step would OR - @ the sign bit into all of AL. That's not what we want... - bmi 1f - mov ip, r3 - asrs r3, r2 - orrs al, r3 - mov r3, ip -1: - negs r2, r2 - lsls r3, r2 - orrs al, r3 - RET -#else - subs r3, r2, #32 - rsb ip, r2, #32 - movmi al, al, lsr r2 - movpl al, ah, asr r3 - orrmi al, al, ah, lsl ip - mov ah, ah, asr r2 - RET -#endif - - FUNC_END aeabi_lasr - FUNC_END ashrdi3 - -#endif + +// long long __aeabi_lasr(long long, int) +// Arithmetic shift right the 64 bit value in $r1:$r0 by the count in $r2. +// The result is only guaranteed for shifts in the range of '0' to '63'. +// Uses $r3 as scratch space. +FUNC_START_SECTION aeabi_lasr .text.sorted.libgcc.ashrdi3 +FUNC_ALIAS ashrdi3 aeabi_lasr + CFI_START_FUNCTION + + #if defined(__thumb__) && __thumb__ + + // Save a copy for the remainder. + movs r3, xxh + + // Assume a simple shift. + lsrs xxl, r2 + asrs xxh, r2 + + // Test if the shift distance is larger than 1 word. + subs r2, #32 + + #ifdef __HAVE_FEATURE_IT + do_it lo,te + + // The remainder is opposite the main shift, (32 - x) bits. 
+ rsblo r2, #0 + lsllo r3, r2 + + // The remainder shift extends into the hi word. + asrhs r3, r2 + + #else /* !__HAVE_FEATURE_IT */ + bhs LLSYM(__lasr_large) + + // The remainder is opposite the main shift, (32 - x) bits. + rsbs r2, #0 + lsls r3, r2 + + // Cancel any remaining shift. + eors r2, r2 + + LLSYM(__lasr_large): + // Apply any remaining shift to the hi word. + asrs r3, r2 + + #endif /* !__HAVE_FEATURE_IT */ + + // Merge remainder and result. + adds xxl, r3 + RET + + #else /* !__thumb__ */ + + subs r3, r2, #32 + rsb ip, r2, #32 + movmi xxl, xxl, lsr r2 + movpl xxl, xxh, asr r3 + orrmi xxl, xxl, xxh, lsl ip + mov xxh, xxh, asr r2 + RET + + #endif /* !__thumb__ */ + + CFI_END_FUNCTION +FUNC_END ashrdi3 +FUNC_END aeabi_lasr + +#endif /* L_ashrdi3 */ + #ifdef L_ashldi3 - FUNC_START ashldi3 - FUNC_ALIAS aeabi_llsl ashldi3 - -#ifdef __thumb__ - lsls ah, r2 - movs r3, al - lsls al, r2 - mov ip, r3 - subs r2, #32 - lsls r3, r2 - orrs ah, r3 - negs r2, r2 - mov r3, ip - lsrs r3, r2 - orrs ah, r3 - RET -#else - subs r3, r2, #32 - rsb ip, r2, #32 - movmi ah, ah, lsl r2 - movpl ah, al, lsl r3 - orrmi ah, ah, al, lsr ip - mov al, al, lsl r2 - RET -#endif - FUNC_END aeabi_llsl - FUNC_END ashldi3 - -#endif +// long long __aeabi_llsl(long long, int) +// Logical shift left the 64 bit value in $r1:$r0 by the count in $r2. +// The result is only guaranteed for shifts in the range of '0' to '63'. +// Uses $r3 as scratch space. +.section .text.sorted.libgcc.ashldi3,"x" +FUNC_START_SECTION aeabi_llsl .text.sorted.libgcc.ashldi3 +FUNC_ALIAS ashldi3 aeabi_llsl + CFI_START_FUNCTION + + #if defined(__thumb__) && __thumb__ + + // Save a copy for the remainder. + movs r3, xxl + + // Assume a simple shift. + lsls xxl, r2 + lsls xxh, r2 + + // Test if the shift distance is larger than 1 word. + subs r2, #32 + + #ifdef __HAVE_FEATURE_IT + do_it lo,te + + // The remainder is opposite the main shift, (32 - x) bits. 
+ rsblo r2, #0 + lsrlo r3, r2 + + // The remainder shift extends into the hi word. + lslhs r3, r2 + + #else /* !__HAVE_FEATURE_IT */ + bhs LLSYM(__llsl_large) + + // The remainder is opposite the main shift, (32 - x) bits. + rsbs r2, #0 + lsrs r3, r2 + + // Cancel any remaining shift. + eors r2, r2 + + LLSYM(__llsl_large): + // Apply any remaining shift to the hi word. + lsls r3, r2 + + #endif /* !__HAVE_FEATURE_IT */ + + // Merge remainder and result. + adds xxh, r3 + RET + + #else /* !__thumb__ */ + + subs r3, r2, #32 + rsb ip, r2, #32 + movmi xxh, xxh, lsl r2 + movpl xxh, xxl, lsl r3 + orrmi xxh, xxh, xxl, lsr ip + mov xxl, xxl, lsl r2 + RET + + #endif /* !__thumb__ */ + + CFI_END_FUNCTION +FUNC_END ashldi3 +FUNC_END aeabi_llsl + +#endif /* L_ashldi3 */ + +

From patchwork Fri Jan 15 11:30:40 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 1426907
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v5 12/33] Import 'clrsb' functions from the CM0 library
Date: Fri, 15 Jan 2021 03:30:40 -0800
X-Mailer: git-send-email 2.25.1
Cc: Richard.Earnshaw@foss.arm.com

This implementation provides an efficient tail call to __clzsi2(), making the functions rather smaller and faster than the C versions.

gcc/libgcc/ChangeLog:
2021-01-13 Daniel Engel

	* config/arm/bits/clz2.S (__clrsbsi2, __clrsbdi2): Added new functions.
	* config/arm/t-elf (LIB1ASMFUNCS): Added new function objects _clrsbsi2 and _clrsbdi2.
---
 libgcc/config/arm/clz2.S | 108 ++++++++++++++++++++++++++++++++++++++-
 libgcc/config/arm/t-elf | 2 +
 2 files changed, 108 insertions(+), 2 deletions(-)

diff --git a/libgcc/config/arm/clz2.S b/libgcc/config/arm/clz2.S
index dc246708a82..5f608c0c2a3 100644
--- a/libgcc/config/arm/clz2.S
+++ b/libgcc/config/arm/clz2.S
@@ -1,4 +1,4 @@ -/* clz2.S: Cortex M0 optimized 'clz' functions +/* clz2.S: ARM optimized 'clz' and related functions Copyright (C) 2018-2021 Free Software Foundation, Inc. Contributed by Daniel Engel (gnu@danielengel.com)
@@ -23,7 +23,7 @@ . */ -#if defined(__ARM_FEATURE_CLZ) && __ARM_FEATURE_CLZ +#ifdef __ARM_FEATURE_CLZ #ifdef L_clzdi2
@@ -241,3 +241,107 @@ FUNC_END clzdi2 #endif /* !__ARM_FEATURE_CLZ */ + +#ifdef L_clrsbdi2 + +// int __clrsbdi2(long long) +// Counts the number of "redundant sign bits" in $r1:$r0. +// Returns the result in $r0. +// Uses $r2 and $r3 as scratch space. +FUNC_START_SECTION clrsbdi2 .text.sorted.libgcc.clz2.clrsbdi2 + CFI_START_FUNCTION + + #if defined(__ARM_FEATURE_CLZ) && __ARM_FEATURE_CLZ + // Invert negative signs to keep counting zeros. + asrs r3, xxh, #31 + eors xxl, r3 + eors xxh, r3 + + // Same as __clzdi2(), except that the 'C' flag is pre-calculated.
+ // Also, the trailing 'subs', since the last bit is not redundant. + do_it eq, et + clzeq r0, xxl + clzne r0, xxh + addeq r0, #32 + subs r0, #1 + RET + + #else /* !__ARM_FEATURE_CLZ */ + // Result if all the bits in the argument are zero. + // Set it here to keep the flags clean after 'eors' below. + movs r2, #31 + + // Invert negative signs to keep counting zeros. + asrs r3, xxh, #31 + eors xxh, r3 + + #if defined(__ARMEB__) && __ARMEB__ + // If the upper word is non-zero, return '__clzsi2(upper) - 1'. + bne SYM(__internal_clzsi2) + + // The upper word is zero, prepare the lower word. + movs r0, r1 + eors r0, r3 + + #else /* !__ARMEB__ */ + // Save the lower word temporarily. + // This somewhat awkward construction adds one cycle when the + // branch is not taken, but prevents a double-branch. + eors r3, r0 + + // If the upper word is non-zero, return '__clzsi2(upper) - 1'. + movs r0, r1 + bne SYM(__internal_clzsi2) + + // Restore the lower word. + movs r0, r3 + + #endif /* !__ARMEB__ */ + + // The upper word is zero, return '31 + __clzsi2(lower)'. + adds r2, #32 + b SYM(__internal_clzsi2) + + #endif /* !__ARM_FEATURE_CLZ */ + + CFI_END_FUNCTION +FUNC_END clrsbdi2 + +#endif /* L_clrsbdi2 */ + + +#ifdef L_clrsbsi2 + +// int __clrsbsi2(int) +// Counts the number of "redundant sign bits" in $r0. +// Returns the result in $r0. +// Uses $r2 and possibly $r3 as scratch space. +FUNC_START_SECTION clrsbsi2 .text.sorted.libgcc.clz2.clrsbsi2 + CFI_START_FUNCTION + + // Invert negative signs to keep counting zeros. + asrs r2, r0, #31 + eors r0, r2 + + #if defined(__ARM_FEATURE_CLZ) && __ARM_FEATURE_CLZ + // Count. + clz r0, r0 + + // The result for a positive value will always be >= 1. + // By definition, the last bit is not redundant. + subs r0, #1 + RET + + #else /* !__ARM_FEATURE_CLZ */ + // Result if all the bits in the argument are zero. + // By definition, the last bit is not redundant. 
+ movs r2, #31 + b SYM(__internal_clzsi2) + + #endif /* !__ARM_FEATURE_CLZ */ + + CFI_END_FUNCTION +FUNC_END clrsbsi2 + +#endif /* L_clrsbsi2 */ + diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index 33b83ac4adf..89071cebe45 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -31,6 +31,8 @@ LIB1ASMFUNCS += \ _ashldi3 \ _ashrdi3 \ _lshrdi3 \ + _clrsbsi2 \ + _clrsbdi2 \ _clzdi2 \ _ctzdi2 \ _dvmd_tls \ From patchwork Fri Jan 15 11:30:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426908 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=VQ6l7iKY; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=Rn9Hm+dI; dkim-atps=neutral Received: from sourceware.org (unknown [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJtp650zz9sVr for ; Fri, 15 Jan 2021 22:32:18 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9989139730F1; Fri, 15 Jan 2021 11:31:34 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: 
gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id 5F7CD39730F6 for ; Fri, 15 Jan 2021 11:31:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 5F7CD39730F6 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 73E04F40; Fri, 15 Jan 2021 06:31:31 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:31:31 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=9h/E+GE8XVtDa IChww5W1EqY3r3h7J1a3n7egmLlMbQ=; b=VQ6l7iKYX1SwI8CJpjlWcv5aBN5G/ 5cGsHPPfHHQBVCGWJ+3hF0bUfNyN0Ug1I02ZLuu8/ZoZAwOpAGkm4zpF6YqB1ZRB W9dQKHhuAnew7dm4JctVZciu2aWwUSDM8fM/k+wORpBNN8tTOaTcHoc05m+0nvbJ B0NLMbm46eEskrkYxF4woVPRi43pjM0rfvII9mg01ISD19CEFOwXlPaVVfZLf3yp fI68EWr8I3sAf7JT+fFzUr0HhTuwIJwfRLgfb0I1UqeoWnXd9Z3IWQuQ5Xdyuget 6H8txajLujLuayFE/AeW2VVd3o/D1VsMoxpbkjOcTi/i9HHrPDTeG5msg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=9h/E+GE8XVtDaIChww5W1EqY3r3h7J1a3n7egmLlMbQ=; b=Rn9Hm+dI XNZTjbElPw6J+Ff5ogeQz8RGwfUMQ8bt/iAhFFeZVegzxJJkg/5hi9KXPK2DcHjw FzrO7HIh8pSpepv7IIglcrkV7DmTHnX8QAl7xURAcnHvhpZH3t0Ze40ygNWaT408 c97/CksajTuVztIvu3xLWaix2rsTvJjKB7Lcichu/OR9xwG97d6FK11xmi1S4dSy R0SLjffQJ8vChVeT8wSfJ6Tzxlu4vVfUH3LJfLDxnzPGK1CbiOVaO1uhoxnZ9Bgh tK9uNq22hrSjG9GQTeo9yZa+hk4O0rR4Ego+8lLzm7F6Xh6h6MjkUqOdXYlNXyV4 bKQpFu6PlU8PCA== X-ME-Sender: 
X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddtgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeeifedvjeegvdejheelffegleejke egueekjeetheekledvgffhkeeujefghfegheenucffohhmrghinheptghtiidvrdhssgen ucfkphepjedurdefiedruddttddrvddvtdenucevlhhushhtvghrufhiiigvpedtnecurf grrhgrmhepmhgrihhlfhhrohhmpehgnhhusegurghnihgvlhgvnhhgvghlrdgtohhm X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id B463E1080057; Fri, 15 Jan 2021 06:31:30 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVTSY023730; Fri, 15 Jan 2021 03:31:29 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 13/33] Import 'ffs' functions from the CM0 library Date: Fri, 15 Jan 2021 03:30:41 -0800 Message-Id: <15bee5a89e74c0799c4df98214f10aa42f1b43d4.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: 
"Gcc-patches" This implementation provides an efficient tail call to __clzdi2(), making the functions rather smaller and faster than the C versions. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bits/ctz2.S (__ffssi2, __ffsdi2): New functions. * config/arm/t-elf (LIB1ASMFUNCS): Added _ffssi2 and _ffsdi2. --- libgcc/config/arm/ctz2.S | 77 +++++++++++++++++++++++++++++++++++++++- libgcc/config/arm/t-elf | 2 ++ 2 files changed, 78 insertions(+), 1 deletion(-) diff --git a/libgcc/config/arm/ctz2.S b/libgcc/config/arm/ctz2.S index ee6df6d6d01..545f8f94d71 100644 --- a/libgcc/config/arm/ctz2.S +++ b/libgcc/config/arm/ctz2.S @@ -1,4 +1,4 @@ -/* ctz2.S: ARM optimized 'ctz' functions +/* ctz2.S: ARM optimized 'ctz' and related functions Copyright (C) 2020-2021 Free Software Foundation, Inc. Contributed by Daniel Engel (gnu@danielengel.com) @@ -237,3 +237,78 @@ FUNC_END ctzdi2 #endif /* L_ctzsi2 || L_ctzdi2 */ + +#ifdef L_ffsdi2 + +// int __ffsdi2(int) +// Return the index of the least significant 1-bit in $r1:r0, +// or zero if $r1:r0 is zero. The least significant bit is index 1. +// Returns the result in $r0. +// Uses $r2 and possibly $r3 as scratch space. +// Same section as __ctzsi2() for sake of the tail call branches. +FUNC_START_SECTION ffsdi2 .text.sorted.libgcc.ctz2.ffsdi2 + CFI_START_FUNCTION + + // Simplify branching by assuming a non-zero lower word. + // For all such, ffssi2(x) == ctzsi2(x) + 1. + movs r2, #(33 - CTZ_RESULT_OFFSET) + + #if defined(__ARMEB__) && __ARMEB__ + // HACK: Save the upper word in a scratch register. + movs r3, r0 + + // Test the lower word. + movs r0, r1 + bne SYM(__internal_ctzsi2) + + // Test the upper word. + movs r2, #(65 - CTZ_RESULT_OFFSET) + movs r0, r3 + bne SYM(__internal_ctzsi2) + + #else /* !__ARMEB__ */ + // Test the lower word. + cmp r0, #0 + bne SYM(__internal_ctzsi2) + + // Test the upper word. 
+ movs r2, #(65 - CTZ_RESULT_OFFSET) + movs r0, r1 + bne SYM(__internal_ctzsi2) + + #endif /* !__ARMEB__ */ + + // Upper and lower words are both zero. + RET + + CFI_END_FUNCTION +FUNC_END ffsdi2 + +#endif /* L_ffsdi2 */ + + +#ifdef L_ffssi2 + +// int __ffssi2(int) +// Return the index of the least significant 1-bit in $r0, +// or zero if $r0 is zero. The least significant bit is index 1. +// Returns the result in $r0. +// Uses $r2 and possibly $r3 as scratch space. +// Same section as __ctzsi2() for sake of the tail call branches. +FUNC_START_SECTION ffssi2 .text.sorted.libgcc.ctz2.ffssi2 + CFI_START_FUNCTION + + // Simplify branching by assuming a non-zero argument. + // For all such, ffssi2(x) == ctzsi2(x) + 1. + movs r2, #(33 - CTZ_RESULT_OFFSET) + + // Test for zero, return unmodified. + cmp r0, #0 + bne SYM(__internal_ctzsi2) + RET + + CFI_END_FUNCTION +FUNC_END ffssi2 + +#endif /* L_ffssi2 */ + diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index 89071cebe45..346fc766f17 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -35,6 +35,8 @@ LIB1ASMFUNCS += \ _clrsbdi2 \ _clzdi2 \ _ctzdi2 \ + _ffssi2 \ + _ffsdi2 \ _dvmd_tls \ _divsi3 \ _modsi3 \ From patchwork Fri Jan 15 11:30:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426909 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 
header.b=xl4x+vDd; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=LY1lZn19; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJtv1dltz9sVr for ; Fri, 15 Jan 2021 22:32:23 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7E42139730FF; Fri, 15 Jan 2021 11:31:37 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id BB7FB39730F9 for ; Fri, 15 Jan 2021 11:31:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org BB7FB39730F9 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute2.internal (compute2.nyi.internal [10.202.2.42]) by mailout.west.internal (Postfix) with ESMTP id D2945F40; Fri, 15 Jan 2021 06:31:33 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Fri, 15 Jan 2021 06:31:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=BB3ABrv90ySdx 0HzjcJK1bHvW4jsfK5ghJvbhMu0/5A=; b=xl4x+vDdXzeXEi46n+3JpbHoDOcdT +K3iaIO8dr69mCeOpWGMhJK9Xn/gmS+Oinfthcae/wEiw27qPdhmisXbugcn9Mxt QGgYJsBPRyLJidyjoI4CvZi4r8aoeKAFiTJAEUT0TLRBYTR/U9WqBRRSatUgFuC5 JckZZZMtwFIEX3wVUbN8P1xdoZdklV6Vr8y/pVSCG/PJba0kP7FqCX8yxRR9wfZh 
EH4O0wkAJHg7GY/YwQH/OZamQHxx9HIkDHzmTznRaQ/3Ub6ku77EGRvR2h3l1ZQ+ 8MzWTLuQV4nlPHiHH/LZ2zzSmV7qX2Rtbyz7ZZQkuiq6G65jrZQCuFCMA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=BB3ABrv90ySdx0HzjcJK1bHvW4jsfK5ghJvbhMu0/5A=; b=LY1lZn19 9wDvU3PosWrWrmKJYg4ZSQTaasW5QHN28Gz00ApFvfnmAFR0856zI6E+jv8L5wTc OHDfRsKCfIehxDKhqidXe63P9rU8iuy2/zkLPibgfShLypy1Zu+b0c+LjsesLStu ULYED6cgR86sTcjUif+3I0l349PZhig+68BH8jZkwZNBUBPoRBUEDeRiFOeb3qZl lVOHbTrWt6NTKMyPTkFnFmaoa9PF0DjX7ZkP1QxA/g5PZrk56MOy6FRZ+mH/4hwM 6TsOC1SZNhf7aLLfOUiFh8FSuAiGT+PUJj4iZXyrKBoE9uyPnWXSs1oSOZsbaL18 PjeZi8Hpd737sQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddthecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeduudefkedvieduvefggfefheejgf eugedthfejleeutdetledtledutefgudevjeenucffohhmrghinheplhhisgdufhhunhgt shdrshgspdhprghrihhthidrshgspdhgnhhurdhorhhgnecukfhppeejuddrfeeirddutd dtrddvvddtnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhho mhepghhnuhesuggrnhhivghlvghnghgvlhdrtghomh X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id D47AF1080063; Fri, 15 Jan 2021 06:31:32 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVVvK023733; Fri, 15 Jan 2021 03:31:31 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 14/33] Import 'parity' functions from the CM0 library Date: Fri, 
15 Jan 2021 03:30:42 -0800 Message-Id: <866e943f2758532649867084817d6d181c3d1d21.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" The functional overlap between the single- and double-word functions makes this implementation about half the size of the C functions if both functions are linked in the same application. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/parity.S: New file for __paritysi2/di2(). * config/arm/lib1funcs.S: #include "parity.S". * config/arm/t-elf (LIB1ASMFUNCS): Added _paritysi2/di2.
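Illustrative note, not part of the patch: the overlap mentioned above comes from __paritydi2() XOR-ing its two words together and then falling through into the __paritysi2() body. The following is a minimal C model of that structure, assuming the shift-and-XOR fold used in the assembly of this patch. The names parity32_model and parity64_model are hypothetical and exist only for this sketch; they are not libgcc symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the single-word fold: each step XORs the lower
   half of the remaining field into the upper half, so that bit 31 ends
   up holding the XOR of all 32 input bits. */
static int parity32_model(uint32_t x)
{
    x ^= x << 16;
    x ^= x << 8;
    x ^= x << 4;
    x ^= x << 2;
    x ^= x << 1;
    return (int)(x >> 31);
}

/* Hypothetical model of the double-word entry: combine the two words
   with one XOR, then reuse the single-word code unchanged.  Word order
   does not matter because XOR is commutative. */
static int parity64_model(uint64_t x)
{
    return parity32_model((uint32_t)x ^ (uint32_t)(x >> 32));
}
```

Under these assumptions the double-word variant costs a single extra XOR on top of the single-word code, which is consistent with the size claim in the description.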
--- libgcc/config/arm/lib1funcs.S | 1 + libgcc/config/arm/parity.S | 120 ++++++++++++++++++++++++++++++++++ libgcc/config/arm/t-elf | 2 + 3 files changed, 123 insertions(+) create mode 100644 libgcc/config/arm/parity.S diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index 7ac50230725..600ea2dfdc9 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -1704,6 +1704,7 @@ LSYM(Lover12): #include "clz2.S" #include "ctz2.S" +#include "parity.S" /* ------------------------------------------------------------------------ */ /* These next two sections are here despite the fact that they contain Thumb diff --git a/libgcc/config/arm/parity.S b/libgcc/config/arm/parity.S new file mode 100644 index 00000000000..45233bc9d8f --- /dev/null +++ b/libgcc/config/arm/parity.S @@ -0,0 +1,120 @@ +/* parity.S: ARM optimized parity functions + + Copyright (C) 2020-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel (gnu@danielengel.com) + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . 
*/ + + +#ifdef L_paritydi2 + +// int __paritydi2(int) +// Returns '0' if the number of bits set in $r1:r0 is even, and '1' otherwise. +// Returns the result in $r0. +FUNC_START_SECTION paritydi2 .text.sorted.libgcc.paritydi2 + CFI_START_FUNCTION + + // Combine the upper and lower words, then fall through. + // Byte-endianness does not matter for this function. + eors r0, r1 + +#endif /* L_paritydi2 */ + + +// The implementation of __paritydi2() tightly couples with __paritysi2(), +// such that instructions must appear consecutively in the same memory +// section for proper flow control. However, this construction inhibits +// the ability to discard __paritydi2() when only using __paritysi2(). +// Therefore, this block configures __paritysi2() for compilation twice. +// The first version is a minimal standalone implementation, and the second +// version is the continuation of __paritydi2(). The standalone version must +// be declared WEAK, so that the combined version can supersede it and +// provide both symbols when required. +// '_paritysi2' should appear before '_paritydi2' in LIB1ASMFUNCS. +#if defined(L_paritysi2) || defined(L_paritydi2) + +#ifdef L_paritysi2 +// int __paritysi2(int) +// Returns '0' if the number of bits set in $r0 is even, and '1' otherwise. +// Returns the result in $r0. +// Uses $r2 as scratch space. +WEAK_START_SECTION paritysi2 .text.sorted.libgcc.paritysi2 + CFI_START_FUNCTION + +#else /* L_paritydi2 */ +FUNC_ENTRY paritysi2 + +#endif + + #if defined(__thumb__) && __thumb__ + #if defined(__OPTIMIZE_SIZE__) && __OPTIMIZE_SIZE__ + + // Size optimized: 16 bytes, 40 cycles + // Speed optimized: 24 bytes, 14 cycles + movs r2, #16 + + LLSYM(__parity_loop): + // Calculate the parity of successively smaller half-words into the MSB. + movs r1, r0 + lsls r1, r2 + eors r0, r1 + lsrs r2, #1 + bne LLSYM(__parity_loop) + + #else /* !__OPTIMIZE_SIZE__ */ + + // Unroll the loop. 
The 'libgcc' reference C implementation replaces + // the x2 and the x1 shifts with a constant. However, since it takes + // 4 cycles to load, index, and mask the constant result, it doesn't + // cost anything to keep shifting (and saves a few bytes). + lsls r1, r0, #16 + eors r0, r1 + lsls r1, r0, #8 + eors r0, r1 + lsls r1, r0, #4 + eors r0, r1 + lsls r1, r0, #2 + eors r0, r1 + lsls r1, r0, #1 + eors r0, r1 + + #endif /* !__OPTIMIZE_SIZE__ */ + #else /* !__thumb__ */ + + eors r0, r0, r0, lsl #16 + eors r0, r0, r0, lsl #8 + eors r0, r0, r0, lsl #4 + eors r0, r0, r0, lsl #2 + eors r0, r0, r0, lsl #1 + + #endif /* !__thumb__ */ + + lsrs r0, #31 + RET + + CFI_END_FUNCTION +FUNC_END paritysi2 + +#ifdef L_paritydi2 +FUNC_END paritydi2 +#endif + +#endif /* L_paritysi2 || L_paritydi2 */ + diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index 346fc766f17..0e9b9ce21af 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -24,6 +24,7 @@ endif # !__symbian__ LIB1ASMFUNCS += \ _clzsi2 \ _ctzsi2 \ + _paritysi2 \ # Group 1: Integer function objects. 
@@ -37,6 +38,7 @@ LIB1ASMFUNCS += \ _ctzdi2 \ _ffssi2 \ _ffsdi2 \ + _paritydi2 \ _dvmd_tls \ _divsi3 \ _modsi3 \ From patchwork Fri Jan 15 11:30:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426910 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=E/ai1EnJ; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=GKlkGjAh; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJv01lBjz9sRK for ; Fri, 15 Jan 2021 22:32:28 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E2B6D39730FC; Fri, 15 Jan 2021 11:31:39 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id A39863973049 for ; Fri, 15 Jan 2021 11:31:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 
A39863973049 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id BDA11F71; Fri, 15 Jan 2021 06:31:35 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:31:35 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=gj+mpZxcaF0D5 4eHPYFGN2K7fPdVtbGLjxfIhLhsfJQ=; b=E/ai1EnJqIJpoJh11zkDylar07gUO BVupk+BJcayWApd836H79PlFSnwZO06SCV6vltP9Rmbt5w0qmlSNA16mWsqQ6c7J u3bWB6h0EthLELIqeUOt5OI7b17iyH+N2aF3evOX6U3ITog4f0OLfxSUiwOUnWem VfsIy1m+HMauJuoBwX1iLMwdbFsicfKKT0kBDEOjixN6Ail1lBnBNtPAysjxztFr PmZD8J/eoz5YuYmt+cLNoyGlig7YL39UHzTFl95SU9BRhXPKccMpF6J/V44YCaUq Ajt2WxMk6t1Y98XauT9WiLnviveli+ZHb0loknpKvKslZqfkz6An79WVg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=gj+mpZxcaF0D54eHPYFGN2K7fPdVtbGLjxfIhLhsfJQ=; b=GKlkGjAh AVlu79qdv8kyy9fmJ72F/dwokeAxXWAVO6mf7aaaREzjekvF0tVBet/uSnSjs4wI IIdQod+OsrCw0XA/62FR6QWuNHkAVJg2bIIxnWRIIcQW/S7DknNrPdgYUWiK5ytP OiNnHeup3ELHm2S7BdaRkaB0hFwzKutcBjEO9YllavWgobdMIiLO1NdYRGPFcrG7 I1QMotTcnKVRUOU7xC25AItGmWpFsT5xD4YgDzL7pr+pWgxR/uqUbyv5c1DDxCKJ /SvFL2DYDl1mhvE+rocNHpBf/sPvalfxTrOV6grm7hqj5jThq2T96DdbIydEfSMb RP/21y+QGo9peA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddtgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd 
ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeetkeeikeevieegjeetheejueffte dugeevfedvuefhfeduhfffgfefleegvdfhgeenucffohhmrghinheplhhisgdufhhunhgt shdrshgspdhpohhptghnthdrshgspdhgnhhurdhorhhgnecukfhppeejuddrfeeirddutd dtrddvvddtnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhho mhepghhnuhesuggrnhhivghlvghnghgvlhdrtghomh X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id 001B21080064; Fri, 15 Jan 2021 06:31:34 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVXHR023736; Fri, 15 Jan 2021 03:31:33 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 15/33] Import 'popcnt' functions from the CM0 library Date: Fri, 15 Jan 2021 03:30:43 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" The functional overlap between the single- and double-word functions makes this implementation about 30% smaller than the C functions if both functions are linked together in the same application.
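Illustrative note, not part of the patch: in the speed-optimized path, the double-word function reduces each word to 4-bit partial counts, adds the two words together, and then joins the single-word flow for the remaining steps. The following is a minimal C model of that sharing, assuming the mask-based reduction used in the assembly of this patch. The helper names (reduce_to_nibbles, finish_from_nibbles, popcount32_model, popcount64_model) are hypothetical and are not libgcc symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reduce a word to eight 4-bit fields, each holding
   the number of bits set in that field (at most 4). */
static uint32_t reduce_to_nibbles(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);
    return (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
}

/* Hypothetical helper: the shared tail.  Each 4-bit field may hold up to
   8 on entry, so pairs are summed into bytes and the byte sums are then
   accumulated into the most significant byte. */
static int finish_from_nibbles(uint32_t x)
{
    x = (x & 0x0F0F0F0Fu) + ((x >> 4) & 0x0F0F0F0Fu);
    x += x << 8;
    x += x << 16;
    return (int)(x >> 24);
}

static int popcount32_model(uint32_t x)
{
    return finish_from_nibbles(reduce_to_nibbles(x));
}

/* The double-word model only adds one reduction and one addition before
   reusing the single-word tail. */
static int popcount64_model(uint64_t x)
{
    uint32_t lo = reduce_to_nibbles((uint32_t)x);
    uint32_t hi = reduce_to_nibbles((uint32_t)(x >> 32));
    return finish_from_nibbles(lo + hi);
}
```

The sum lo + hi cannot overflow a 4-bit field because each operand field is at most 4, which is why the two words can be merged before the final masking steps.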
gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/popcnt.S (__popcountsi2, __popcountdi2): New file. * config/arm/lib1funcs.S: #include "popcnt.S". * config/arm/t-elf (LIB1ASMFUNCS): Add _popcountsi2/di2. --- libgcc/config/arm/lib1funcs.S | 1 + libgcc/config/arm/popcnt.S | 189 ++++++++++++++++++++++++++++++++++ libgcc/config/arm/t-elf | 2 + 3 files changed, 192 insertions(+) create mode 100644 libgcc/config/arm/popcnt.S diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index 600ea2dfdc9..bd84a3e4281 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -1705,6 +1705,7 @@ LSYM(Lover12): #include "clz2.S" #include "ctz2.S" #include "parity.S" +#include "popcnt.S" /* ------------------------------------------------------------------------ */ /* These next two sections are here despite the fact that they contain Thumb diff --git a/libgcc/config/arm/popcnt.S b/libgcc/config/arm/popcnt.S new file mode 100644 index 00000000000..51b1ed745ee --- /dev/null +++ b/libgcc/config/arm/popcnt.S @@ -0,0 +1,189 @@ +/* popcnt.S: ARM optimized popcount functions + + Copyright (C) 2020-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel (gnu@danielengel.com) + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation.
+ + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + + +#ifdef L_popcountdi2 + +// int __popcountdi2(int) +// Returns the number of bits set in $r1:$r0. +// Returns the result in $r0. +FUNC_START_SECTION popcountdi2 .text.sorted.libgcc.popcountdi2 + CFI_START_FUNCTION + + #if defined(__OPTIMIZE_SIZE__) && __OPTIMIZE_SIZE__ + // Initialize the result. + // Compensate for the two extra loops (one for each word) + // required to detect zero arguments. + movs r2, #2 + + LLSYM(__popcountd_loop): + // Same as __popcounts_loop below, except for $r1. + subs r2, #1 + subs r3, r1, #1 + ands r1, r3 + bcs LLSYM(__popcountd_loop) + + // Repeat the operation for the second word. + b LLSYM(__popcounts_loop) + + #else /* !__OPTIMIZE_SIZE__ */ + // Load the one-bit alternating mask. + ldr r3, =0x55555555 + + // Reduce the second word. + lsrs r2, r1, #1 + ands r2, r3 + subs r1, r2 + + // Reduce the first word. + lsrs r2, r0, #1 + ands r2, r3 + subs r0, r2 + + // Load the two-bit alternating mask. + ldr r3, =0x33333333 + + // Reduce the second word. + lsrs r2, r1, #2 + ands r2, r3 + ands r1, r3 + adds r1, r2 + + // Reduce the first word. + lsrs r2, r0, #2 + ands r2, r3 + ands r0, r3 + adds r0, r2 + + // There will be a maximum of 8 bits in each 4-bit field. + // Jump into the single word flow to combine and complete. + b LLSYM(__popcounts_merge) + + #endif /* !__OPTIMIZE_SIZE__ */ +#endif /* L_popcountdi2 */ + + +// The implementation of __popcountdi2() tightly couples with __popcountsi2(), +// such that instructions must appear consecutively in the same memory +// section for proper flow control. However, this construction inhibits +// the ability to discard __popcountdi2() when only using __popcountsi2(). +// Therefore, this block configures __popcountsi2() for compilation twice.
+// The first version is a minimal standalone implementation, and the second +// version is the continuation of __popcountdi2(). The standalone version must +// be declared WEAK, so that the combined version can supersede it and +// provide both symbols when required. +// '_popcountsi2' should appear before '_popcountdi2' in LIB1ASMFUNCS. +#if defined(L_popcountsi2) || defined(L_popcountdi2) + +#ifdef L_popcountsi2 +// int __popcountsi2(int) +// Returns the number of bits set in $r0. +// Returns the result in $r0. +// Uses $r2 as scratch space. +WEAK_START_SECTION popcountsi2 .text.sorted.libgcc.popcountsi2 + CFI_START_FUNCTION + +#else /* L_popcountdi2 */ +FUNC_ENTRY popcountsi2 + +#endif + + #if defined(__OPTIMIZE_SIZE__) && __OPTIMIZE_SIZE__ + // Initialize the result. + // Compensate for the extra loop required to detect zero. + movs r2, #1 + + // Kernighan's algorithm for __popcount(x): + // for (c = 0; x; c++) + // x &= x - 1; + + LLSYM(__popcounts_loop): + // Every loop counts for a '1' set in the argument. + // Count down since it's easier to initialize positive compensation, + // and the negation before function return is free. + subs r2, #1 + + // Clear one bit per loop. + subs r3, r0, #1 + ands r0, r3 + + // If this is a test for zero, it will be impossible to distinguish + // between zero and one bits set: both terminate after one loop. + // Instead, subtraction underflow flags when zero entered the loop. + bcs LLSYM(__popcounts_loop) + + // Invert the result, since we have been counting negative. + rsbs r0, r2, #0 + RET + + #else /* !__OPTIMIZE_SIZE__ */ + + // Load the one-bit alternating mask. + ldr r3, =0x55555555 + + // Reduce the word. + lsrs r1, r0, #1 + ands r1, r3 + subs r0, r1 + + // Load the two-bit alternating mask. + ldr r3, =0x33333333 + + // Reduce the word. + lsrs r1, r0, #2 + ands r0, r3 + ands r1, r3 + LLSYM(__popcounts_merge): + adds r0, r1 + + // Load the four-bit alternating mask.
+ ldr r3, =0x0F0F0F0F + + // Reduce the word. + lsrs r1, r0, #4 + ands r0, r3 + ands r1, r3 + adds r0, r1 + + // Accumulate individual byte sums into the MSB. + lsls r1, r0, #8 + adds r0, r1 + lsls r1, r0, #16 + adds r0, r1 + + // Isolate the cumulative sum. + lsrs r0, #24 + RET + + #endif /* !__OPTIMIZE_SIZE__ */ + + CFI_END_FUNCTION +FUNC_END popcountsi2 + +#ifdef L_popcountdi2 +FUNC_END popcountdi2 +#endif + +#endif /* L_popcountsi2 || L_popcountdi2 */ + diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index 0e9b9ce21af..2e3f04aa2f0 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -25,6 +25,7 @@ LIB1ASMFUNCS += \ _clzsi2 \ _ctzsi2 \ _paritysi2 \ + _popcountsi2 \ # Group 1: Integer function objects. @@ -39,6 +40,7 @@ LIB1ASMFUNCS += \ _ffssi2 \ _ffsdi2 \ _paritydi2 \ + _popcountdi2 \ _dvmd_tls \ _divsi3 \ _modsi3 \ From patchwork Fri Jan 15 11:30:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426911 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=TRHKHnVC; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=ECRgSwnW; dkim-atps=neutral Received: from sourceware.org (unknown [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) 
key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJv54fR1z9sRK for ; Fri, 15 Jan 2021 22:32:33 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 511433973122; Fri, 15 Jan 2021 11:31:41 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id CF6A73973049 for ; Fri, 15 Jan 2021 11:31:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org CF6A73973049 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id D4BB8EBC; Fri, 15 Jan 2021 06:31:37 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:31:38 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=+jg/M2rD61KvL gRAkbI80kCQ3hn84b063qPBrj4sQmo=; b=TRHKHnVC7tGo/BEGTLsKW425G0QV2 SSMPqWtp1waqqWb6xHSlRqVYmU62D65pKA/L+mV60KCx4ZSwrjvBdKwOacumM1qs lOt+15xL/0e5B/BOiY6BVfqgN7KKxdvwBQyUi2i0DVyG5bWZOq8kDQ5hoVNX6JsR 8LEbM5dHRfCeuL1HvUkxcEKFfguHMvqc2PQ/1th6w2eikjKoGVcALzHMG5/2aVzO 88lZj+BjMKNCj1Ekr6JfqeCEekn/VX+8jFYKKBZdiVmmPqd+cypFeh2p7JDiVcTB IXLnvGDxMIpGeckOXulW4aUb65T3Az2hFLccYzsqcH7Yi6P2R+P3E/uEg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; 
bh=+jg/M2rD61KvLgRAkbI80kCQ3hn84b063qPBrj4sQmo=; b=ECRgSwnW bDd0a3xjJ+04WDMfd5n8W+mekgM6oLETwEHBOUK75I+IKbU86vDAGO3UpPnjCK4a d0Z4rX0lhkF2JsOqGy8EZWMkdbNxajRKp0hwb6AF/382HxA1CD7/FmxBuE2IBMct 7Y9Cq/Kz0sk41e4aYArOUftL3ThlIW9ezpzZc8OwELasVAOAI2mnUlV32T+DWAFi ad0eFj+nPCGkGL0YTciQ8V1IluEDGx1Pbnhuw5l5FJoeYAtC6EnhB/HdqvAezP7c b4hX3jtPGcvAtKdQ7KmSWccX2zTiQ2GRDKL7WcDsO5WuiLtlg/fCBtHiCUjg5qst vHgDT+J72Bo6qA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddtgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeetteeujefhueetgeejieehfeegue etleevtddthfegveeuhfevieevveefheeigfenucffohhmrghinhepsghprggsihdqvhei mhdrshgspdhltghmphdrshgspdhgnhhurdhorhhgpdhlihgsudhfuhhntghsrdhssgenuc fkphepjedurdefiedruddttddrvddvtdenucevlhhushhtvghrufhiiigvpedtnecurfgr rhgrmhepmhgrihhlfhhrohhmpehgnhhusegurghnihgvlhgvnhhgvghlrdgtohhm X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id 20E2F108005B; Fri, 15 Jan 2021 06:31:37 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVa8J023739; Fri, 15 Jan 2021 03:31:36 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 16/33] Refactor Thumb-1 64-bit comparison into a new file Date: Fri, 15 Jan 2021 03:30:44 -0800 Message-Id: <25486c843cf70340dc396ceb7f346d001b284863.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, 
KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" This will make it easier to isolate changes in subsequent patches. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi-v6m.S (__aeabi_lcmp, __aeabi_ulcmp): Moved to ... * config/arm/eabi/lcmp.S: New file. * config/arm/lib1funcs.S: #include eabi/lcmp.S. --- libgcc/config/arm/bpabi-v6m.S | 46 ---------------------- libgcc/config/arm/eabi/lcmp.S | 73 +++++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 1 + 3 files changed, 74 insertions(+), 46 deletions(-) create mode 100644 libgcc/config/arm/eabi/lcmp.S diff --git a/libgcc/config/arm/bpabi-v6m.S b/libgcc/config/arm/bpabi-v6m.S index 069fcbbf48c..a051c1530a4 100644 --- a/libgcc/config/arm/bpabi-v6m.S +++ b/libgcc/config/arm/bpabi-v6m.S @@ -33,52 +33,6 @@ .eabi_attribute 25, 1 #endif /* __ARM_EABI__ */ -#ifdef L_aeabi_lcmp - -FUNC_START aeabi_lcmp - cmp xxh, yyh - beq 1f - bgt 2f - movs r0, #1 - negs r0, r0 - RET -2: - movs r0, #1 - RET -1: - subs r0, xxl, yyl - beq 1f - bhi 2f - movs r0, #1 - negs r0, r0 - RET -2: - movs r0, #1 -1: - RET - FUNC_END aeabi_lcmp - -#endif /* L_aeabi_lcmp */ - -#ifdef L_aeabi_ulcmp - -FUNC_START aeabi_ulcmp - cmp xxh, yyh - bne 1f - subs r0, xxl, yyl - beq 2f -1: - bcs 1f - movs r0, #1 - negs r0, r0 - RET -1: - movs r0, #1 -2: - RET - FUNC_END aeabi_ulcmp - -#endif /* L_aeabi_ulcmp */ .macro test_div_by_zero signed cmp yyh, #0 diff --git a/libgcc/config/arm/eabi/lcmp.S b/libgcc/config/arm/eabi/lcmp.S new file mode 100644 index 00000000000..336db1d398c --- 
/dev/null +++ b/libgcc/config/arm/eabi/lcmp.S @@ -0,0 +1,73 @@ +/* Miscellaneous BPABI functions. Thumb-1 implementation, suitable for ARMv4T, + ARMv6-M and ARMv8-M Baseline like ISA variants. + + Copyright (C) 2006-2020 Free Software Foundation, Inc. + Contributed by CodeSourcery. + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . 
*/ + + +#ifdef L_aeabi_lcmp + +FUNC_START aeabi_lcmp + cmp xxh, yyh + beq 1f + bgt 2f + movs r0, #1 + negs r0, r0 + RET +2: + movs r0, #1 + RET +1: + subs r0, xxl, yyl + beq 1f + bhi 2f + movs r0, #1 + negs r0, r0 + RET +2: + movs r0, #1 +1: + RET + FUNC_END aeabi_lcmp + +#endif /* L_aeabi_lcmp */ + +#ifdef L_aeabi_ulcmp + +FUNC_START aeabi_ulcmp + cmp xxh, yyh + bne 1f + subs r0, xxl, yyl + beq 2f +1: + bcs 1f + movs r0, #1 + negs r0, r0 + RET +1: + movs r0, #1 +2: + RET + FUNC_END aeabi_ulcmp + +#endif /* L_aeabi_ulcmp */ + diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index bd84a3e4281..5e24d0a6749 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -1991,5 +1991,6 @@ LSYM(Lchange_\register): #include "bpabi.S" #else /* NOT_ISA_TARGET_32BIT */ #include "bpabi-v6m.S" +#include "eabi/lcmp.S" #endif /* NOT_ISA_TARGET_32BIT */ #endif /* !__symbian__ */ From patchwork Fri Jan 15 11:30:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426912 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=S816FUpK; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=fOcH3fBm; dkim-atps=neutral Received: from sourceware.org (unknown [8.43.85.97]) (using 
TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJv95kkyz9sRK for ; Fri, 15 Jan 2021 22:32:37 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 407A43982401; Fri, 15 Jan 2021 11:31:44 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id 5356A3982401 for ; Fri, 15 Jan 2021 11:31:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 5356A3982401 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id EE047F82; Fri, 15 Jan 2021 06:31:39 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:31:40 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=6KqfPaa+JDV7S qJA2s4sakLd7m95dKkyjQaGAka/SVE=; b=S816FUpKNuWqd8Yh2Xc2Vs7WqUwFT l00xIfAnoKMFWm9JMFvkMUK4qB7AHr/nnSsmWRGp0nPFOqjT6Ilu9ZWriGfqozbP ymTLC4Qo/JLHgX8V+uOfrg8IwvE0iEmGFYUkF1+iF/FkrILcNrJ7kCGDKMGdto3u YMT5MhPnbkUQqMmF1VBK3Y9YG0Dhu8E06RCDIe+QZouQ+8+sENbjfTAE/xe1gmxi HN0h3e3z3CtI8hIaqPTrna3Hv/pHuPvahPHwvI2KD/hokLvKTVbowboTVV4ZAWzf xbhUBqRBcQpJiBJFf19PzTd9v4JrQPJLfzGoqGdEwcmy+W59GY6PKkVuQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to 
:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=6KqfPaa+JDV7SqJA2s4sakLd7m95dKkyjQaGAka/SVE=; b=fOcH3fBm PXwNvkB6KRRyT+pqxBlPWA+Fj7BPlM5Q/TRkrcf8/nEQ305aSlhE+Dd5UeL7sMV0 qUotZo/2iD/Iw1Hutc5aBiU+gUdTMiCUxvc2n5n0QO60ZhzldYc21mn2QL08c8zW Y99JxeQv5jVpq/5eaqUjKh69tlhU/PgxsMmwy0Lz0HB3gNeVDmYq099g+Jpl3JsO ksmWeFAcPDz8SmtHj1jgjDiSBuM2OfQraR5dTg0n/e1ec3VgIt4GnFEcgkYviZeF aZn66RSrUnRu0iTcPvlSeKIu+24mV2uT7gJey0yWus5pXXRlAidJeCcMDA1imSXE FpQQ3bdSodk2rw== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddtgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeeiudeiudefgfeifeevvefgheefve dvieegteduleeileekheeftdejkeetieevteenucffohhmrghinheplhgtmhhprdhssgdp ghhnuhdrohhrghenucfkphepjedurdefiedruddttddrvddvtdenucevlhhushhtvghruf hiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehgnhhusegurghnihgvlhgvnhhg vghlrdgtohhm X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id 2C762108005B; Fri, 15 Jan 2021 06:31:39 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVcro023742; Fri, 15 Jan 2021 03:31:38 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 17/33] Import 64-bit comparison from CM0 library Date: Fri, 15 Jan 2021 03:30:45 -0800 Message-Id: <7e3c8fade6ab4ace586ab703fe79fd34c1c259c5.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, 
KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" These are 2-5 instructions smaller and just as fast. Branches are minimized, which will allow easier adaptation to Thumb-2/ARM mode. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/eabi/lcmp.S (__aeabi_lcmp, __aeabi_ulcmp): Replaced; add macro configuration to build __cmpdi2() and __ucmpdi2(). * config/arm/t-elf (LIB1ASMFUNCS): Added _cmpdi2 and _ucmpdi2. --- libgcc/config/arm/eabi/lcmp.S | 151 +++++++++++++++++++++++++--------- libgcc/config/arm/t-elf | 2 + 2 files changed, 112 insertions(+), 41 deletions(-) diff --git a/libgcc/config/arm/eabi/lcmp.S b/libgcc/config/arm/eabi/lcmp.S index 336db1d398c..2ac9d178b34 100644 --- a/libgcc/config/arm/eabi/lcmp.S +++ b/libgcc/config/arm/eabi/lcmp.S @@ -1,8 +1,7 @@ -/* Miscellaneous BPABI functions. Thumb-1 implementation, suitable for ARMv4T, - ARMv6-M and ARMv8-M Baseline like ISA variants. +/* lcmp.S: Thumb-1 optimized 64-bit integer comparison - Copyright (C) 2006-2020 Free Software Foundation, Inc. - Contributed by CodeSourcery. + Copyright (C) 2018-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) This file is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the @@ -24,50 +23,120 @@ . 
*/ +#if defined(L_aeabi_lcmp) || defined(L_cmpdi2) + #ifdef L_aeabi_lcmp + #define LCMP_NAME aeabi_lcmp + #define LCMP_SECTION .text.sorted.libgcc.lcmp +#else + #define LCMP_NAME cmpdi2 + #define LCMP_SECTION .text.sorted.libgcc.cmpdi2 +#endif + +// int __aeabi_lcmp(long long, long long) +// int __cmpdi2(long long, long long) +// Compares the 64 bit signed values in $r1:$r0 and $r3:$r2. +// lcmp() returns $r0 = { -1, 0, +1 } for orderings { <, ==, > } respectively. +// cmpdi2() returns $r0 = { 0, 1, 2 } for orderings { <, ==, > } respectively. +// Object file duplication assumes typical programs follow one runtime ABI. +FUNC_START_SECTION LCMP_NAME LCMP_SECTION + CFI_START_FUNCTION + + // Calculate the difference $r1:$r0 - $r3:$r2. + subs xxl, yyl + sbcs xxh, yyh + + // With $r2 free, create a known offset value without affecting + // the N or Z flags. + // BUG? The originally unified instruction for v6m was 'mov r2, r3'. + // However, this resulted in a compile error with -mthumb: + // "MOV Rd, Rs with two low registers not permitted". + // Since unified syntax deprecates the "cpy" instruction, shouldn't + // there be a backwards-compatible translation available? + cpy r2, r3 + + // Evaluate the comparison result. + blt LLSYM(__lcmp_lt) + + // The reference offset ($r2 - $r3) will be +2 iff the first + // argument is larger, otherwise the offset value remains 0. + adds r2, #2 + + // Check for zero (equality in 64 bits). + // It doesn't matter which register was originally "hi". + orrs r0, r1 + + // The result is already 0 on equality. + beq LLSYM(__lcmp_return) + + LLSYM(__lcmp_lt): + // Create +1 or -1 from the offset value defined earlier. + adds r3, #1 + subs r0, r2, r3 + + LLSYM(__lcmp_return): + #ifdef L_cmpdi2 + // Offset to the correct output specification. 
+ adds r0, #1 + #endif -FUNC_START aeabi_lcmp - cmp xxh, yyh - beq 1f - bgt 2f - movs r0, #1 - negs r0, r0 - RET -2: - movs r0, #1 - RET -1: - subs r0, xxl, yyl - beq 1f - bhi 2f - movs r0, #1 - negs r0, r0 - RET -2: - movs r0, #1 -1: RET - FUNC_END aeabi_lcmp -#endif /* L_aeabi_lcmp */ + CFI_END_FUNCTION +FUNC_END LCMP_NAME + +#endif /* L_aeabi_lcmp || L_cmpdi2 */ + + +#if defined(L_aeabi_ulcmp) || defined(L_ucmpdi2) #ifdef L_aeabi_ulcmp + #define ULCMP_NAME aeabi_ulcmp + #define ULCMP_SECTION .text.sorted.libgcc.ulcmp +#else + #define ULCMP_NAME ucmpdi2 + #define ULCMP_SECTION .text.sorted.libgcc.ucmpdi2 +#endif + +// int __aeabi_ulcmp(unsigned long long, unsigned long long) +// int __ucmpdi2(unsigned long long, unsigned long long) +// Compares the 64 bit unsigned values in $r1:$r0 and $r3:$r2. +// ulcmp() returns $r0 = { -1, 0, +1 } for orderings { <, ==, > } respectively. +// ucmpdi2() returns $r0 = { 0, 1, 2 } for orderings { <, ==, > } respectively. +// Object file duplication assumes typical programs follow one runtime ABI. +FUNC_START_SECTION ULCMP_NAME ULCMP_SECTION + CFI_START_FUNCTION + + // Calculate the 'C' flag. + subs xxl, yyl + sbcs xxh, yyh + + // Capture the carry flg. + // $r2 will contain -1 if the first value is smaller, + // 0 if the first value is larger or equal. + sbcs r2, r2 + + // Check for zero (equality in 64 bits). + // It doesn't matter which register was originally "hi". + orrs r0, r1 + + // The result is already 0 on equality. + beq LLSYM(__ulcmp_return) + + // Assume +1. If -1 is correct, $r2 will override. + movs r0, #1 + orrs r0, r2 + + LLSYM(__ulcmp_return): + #ifdef L_ucmpdi2 + // Offset to the correct output specification. 
+ adds r0, #1 + #endif -FUNC_START aeabi_ulcmp - cmp xxh, yyh - bne 1f - subs r0, xxl, yyl - beq 2f -1: - bcs 1f - movs r0, #1 - negs r0, r0 - RET -1: - movs r0, #1 -2: RET - FUNC_END aeabi_ulcmp -#endif /* L_aeabi_ulcmp */ + CFI_END_FUNCTION +FUNC_END ULCMP_NAME + +#endif /* L_aeabi_ulcmp || L_ucmpdi2 */ diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index 2e3f04aa2f0..83325410097 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -41,6 +41,8 @@ LIB1ASMFUNCS += \ _ffsdi2 \ _paritydi2 \ _popcountdi2 \ + _cmpdi2 \ + _ucmpdi2 \ _dvmd_tls \ _divsi3 \ _modsi3 \ From patchwork Fri Jan 15 11:30:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426913 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=uMsvu6+t; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=ACeMswRg; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJvG4Z7Tz9sRK for ; Fri, 15 Jan 2021 22:32:42 +1100 (AEDT) 
Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AA256398240A; Fri, 15 Jan 2021 11:31:45 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id F082D3982404 for ; Fri, 15 Jan 2021 11:31:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org F082D3982404 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 151F8F7B; Fri, 15 Jan 2021 06:31:42 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:31:42 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=J9YdNbBT+PqJO Z4iSHu8a2+PW2MuqUgRVB4YQpI2mow=; b=uMsvu6+tMnaa49Kk0QgIQFs2Wy66s 9yR1Lgjdb/H2nOR0+iUspOfF2D28jo8lYBpT9Cg8TyimPn/DK0oL/K1Hgpi1Uddv Ov70XR4obORVzEiQDdCId9eGusqUbVl819fOHNLiC8PGvz80OnAFzFGPyZVsmVuf 13/FessfrrBZtvM+R1UiB7Uyk2P5W3U+AX0NIzaLIP8WqkvC1h7rmALEY44gUOfE h6BPzvjOKXkK7QkzgBHm1AIv8932JBuQoawARoLIAM/qcRv+zSjmutSzuuHnWOJq tNvnOja7JjlcVg/MYSjU2hYRrrhHrarRV+EfziUPxZKbdLKrw1vh0SH1Q== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=J9YdNbBT+PqJOZ4iSHu8a2+PW2MuqUgRVB4YQpI2mow=; b=ACeMswRg L9dSZAXAgHh4rLo1/z7DfYvAJ/FngUA464OmWs/vwrBygb+8qlxuuvAJS8ihrAxh 9S6as9TrzGOTr8bUNG+7dTmWUQXeR4R1DE76Us5Mvw9fAwMPD5+HjSXhgC4ikEH3 
1hW0oZv14KT8r64JU8AzaHeW9YjClDQLTI5ZqxlxdED17oLrKH6zabs8SEmjvBI6 7s4zN7AUafQJNnvkRizDhkIy9zQs5KcQYXeEgHBMzII4nHvrrcztux9MVWkl7n02 I89eHDq9XdFkbmjuX8/kA0pOdGkwqs5WS6Y6AfVp/yXgDHWjsGFkaM6x9m+TqJUK 6n5z1EtBEOKarw== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddtgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeefveevheefvdffuefgieduueegud effedukeejueffvdeitddtvddvhfdvgfdtheenucffohhmrghinhepsghprggsihdrshgs pdhltghmphdrshgspdhlihgsudhfuhhntghsrdhssgenucfkphepjedurdefiedruddttd drvddvtdenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhm pehgnhhusegurghnihgvlhgvnhhgvghlrdgtohhm X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id 481FE108005B; Fri, 15 Jan 2021 06:31:41 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVeWp023745; Fri, 15 Jan 2021 03:31:40 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 18/33] Merge Thumb-2 optimizations for 64-bit comparison Date: Fri, 15 Jan 2021 03:30:46 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list 
List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" This effectively merges support for all architecture variants into a common function path with appropriate build conditions. ARM performance is 1-2 instructions faster; Thumb-2 is about 50% faster. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi.S (__aeabi_lcmp, __aeabi_ulcmp): Removed. * config/arm/eabi/lcmp.S (__aeabi_lcmp, __aeabi_ulcmp): Added conditional execution on supported architectures (__ARM_FEATURE_IT). * config/arm/lib1funcs.S: Moved #include scope of eabi/lcmp.S. --- libgcc/config/arm/bpabi.S | 42 ------------------------------- libgcc/config/arm/eabi/lcmp.S | 47 ++++++++++++++++++++++++++++++++++- libgcc/config/arm/lib1funcs.S | 2 +- 3 files changed, 47 insertions(+), 44 deletions(-) diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S index 2cbb67d54ad..4281a2be594 100644 --- a/libgcc/config/arm/bpabi.S +++ b/libgcc/config/arm/bpabi.S @@ -34,48 +34,6 @@ .eabi_attribute 25, 1 #endif /* __ARM_EABI__ */ -#ifdef L_aeabi_lcmp - -ARM_FUNC_START aeabi_lcmp - cmp xxh, yyh - do_it lt - movlt r0, #-1 - do_it gt - movgt r0, #1 - do_it ne - RETc(ne) - subs r0, xxl, yyl - do_it lo - movlo r0, #-1 - do_it hi - movhi r0, #1 - RET - FUNC_END aeabi_lcmp - -#endif /* L_aeabi_lcmp */ - -#ifdef L_aeabi_ulcmp - -ARM_FUNC_START aeabi_ulcmp - cmp xxh, yyh - do_it lo - movlo r0, #-1 - do_it hi - movhi r0, #1 - do_it ne - RETc(ne) - cmp xxl, yyl - do_it lo - movlo r0, #-1 - do_it hi - movhi r0, #1 - do_it eq - moveq r0, #0 - RET - FUNC_END aeabi_ulcmp - -#endif /* L_aeabi_ulcmp */ - .macro test_div_by_zero signed /* Tail-call to divide-by-zero handlers which may be overridden by the user, so unwinding works properly. 
*/ diff --git a/libgcc/config/arm/eabi/lcmp.S b/libgcc/config/arm/eabi/lcmp.S index 2ac9d178b34..f1a9c3b8fe0 100644 --- a/libgcc/config/arm/eabi/lcmp.S +++ b/libgcc/config/arm/eabi/lcmp.S @@ -46,6 +46,19 @@ FUNC_START_SECTION LCMP_NAME LCMP_SECTION subs xxl, yyl sbcs xxh, yyh + #ifdef __HAVE_FEATURE_IT + do_it lt,t + + #ifdef L_aeabi_lcmp + movlt r0, #-1 + #else + movlt r0, #0 + #endif + + // Early return on '<'. + RETc(lt) + + #else /* !__HAVE_FEATURE_IT */ // With $r2 free, create a known offset value without affecting // the N or Z flags. // BUG? The originally unified instruction for v6m was 'mov r2, r3'. @@ -62,17 +75,27 @@ FUNC_START_SECTION LCMP_NAME LCMP_SECTION // argument is larger, otherwise the offset value remains 0. adds r2, #2 + #endif + // Check for zero (equality in 64 bits). // It doesn't matter which register was originally "hi". orrs r0, r1 + #ifdef __HAVE_FEATURE_IT + // The result is already 0 on equality. + // -1 already returned, so just force +1. + do_it ne + movne r0, #1 + + #else /* !__HAVE_FEATURE_IT */ // The result is already 0 on equality. beq LLSYM(__lcmp_return) - LLSYM(__lcmp_lt): + LLSYM(__lcmp_lt): // Create +1 or -1 from the offset value defined earlier. adds r3, #1 subs r0, r2, r3 + #endif LLSYM(__lcmp_return): #ifdef L_cmpdi2 @@ -111,21 +134,43 @@ FUNC_START_SECTION ULCMP_NAME ULCMP_SECTION subs xxl, yyl sbcs xxh, yyh + #ifdef __HAVE_FEATURE_IT + do_it lo,t + + #ifdef L_aeabi_ulcmp + movlo r0, #-1 + #else + movlo r0, #0 + #endif + + // Early return on '<'. + RETc(lo) + + #else // Capture the carry flg. // $r2 will contain -1 if the first value is smaller, // 0 if the first value is larger or equal. sbcs r2, r2 + #endif // Check for zero (equality in 64 bits). // It doesn't matter which register was originally "hi". orrs r0, r1 + #ifdef __HAVE_FEATURE_IT + // The result is already 0 on equality. + // -1 already returned, so just force +1. 
+ do_it ne + movne r0, #1 + + #else /* !__HAVE_FEATURE_IT */ // The result is already 0 on equality. beq LLSYM(__ulcmp_return) // Assume +1. If -1 is correct, $r2 will override. movs r0, #1 orrs r0, r2 + #endif LLSYM(__ulcmp_return): #ifdef L_ucmpdi2 diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index 5e24d0a6749..f41354f811e 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -1991,6 +1991,6 @@ LSYM(Lchange_\register): #include "bpabi.S" #else /* NOT_ISA_TARGET_32BIT */ #include "bpabi-v6m.S" -#include "eabi/lcmp.S" #endif /* NOT_ISA_TARGET_32BIT */ +#include "eabi/lcmp.S" #endif /* !__symbian__ */ From patchwork Fri Jan 15 11:30:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426914 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=DThpGmlD; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=KfYWfRnj; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJvM0cc1z9sSC for ; Fri, 15 Jan 
2021 22:32:47 +1100 (AEDT)
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v5 19/33] Import 32-bit division from the CM0 library
Date: Fri, 15 Jan 2021 03:30:47 -0800
Message-Id: <6491e745f25dd0b0e0697aa2cae822521e6d5ac8.1610709584.git.gnu@danielengel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
X-Spam-Checker-Version: SpamAssassin
3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" gcc/libgcc/ChangeLog: 2021-01-07 Daniel Engel * config/arm/eabi/idiv.S: New file for __udivsi3() and __divsi3(). * config/arm/lib1funcs.S: #include eabi/idiv.S (v6m only). --- libgcc/config/arm/eabi/idiv.S | 299 ++++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 19 ++- 2 files changed, 317 insertions(+), 1 deletion(-) create mode 100644 libgcc/config/arm/eabi/idiv.S diff --git a/libgcc/config/arm/eabi/idiv.S b/libgcc/config/arm/eabi/idiv.S new file mode 100644 index 00000000000..7381e8f57a3 --- /dev/null +++ b/libgcc/config/arm/eabi/idiv.S @@ -0,0 +1,299 @@ +/* div.S: Thumb-1 size-optimized 32-bit integer division + + Copyright (C) 2018-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . 
*/ + + +#ifndef __GNUC__ + +// int __aeabi_idiv0(int) +// Helper function for division by 0. +WEAK_START_SECTION aeabi_idiv0 .text.sorted.libgcc.idiv.idiv0 +FUNC_ALIAS cm0_idiv0 aeabi_idiv0 + CFI_START_FUNCTION + + #if defined(TRAP_EXCEPTIONS) && TRAP_EXCEPTIONS + svc #(SVC_DIVISION_BY_ZERO) + #endif + + RET + + CFI_END_FUNCTION +FUNC_END cm0_idiv0 +FUNC_END aeabi_idiv0 + +#endif /* !__GNUC__ */ + + +#ifdef L_divsi3 + +// int __aeabi_idiv(int, int) +// idiv_return __aeabi_idivmod(int, int) +// Returns signed $r0 after division by $r1. +// Also returns the signed remainder in $r1. +// Same parent section as __divsi3() to keep branches within range. +FUNC_START_SECTION divsi3 .text.sorted.libgcc.idiv.divsi3 + +#ifndef __symbian__ + FUNC_ALIAS aeabi_idiv divsi3 + FUNC_ALIAS aeabi_idivmod divsi3 +#endif + + CFI_START_FUNCTION + + // Extend signs. + asrs r2, r0, #31 + asrs r3, r1, #31 + + // Absolute value of the denominator, abort on division by zero. + eors r1, r3 + subs r1, r3 + #if defined(PEDANTIC_DIV0) && PEDANTIC_DIV0 + beq LLSYM(__idivmod_zero) + #else + beq SYM(__uidivmod_zero) + #endif + + // Absolute value of the numerator. + eors r0, r2 + subs r0, r2 + + // Keep the sign of the numerator in bit[31] (for the remainder). + // Save the XOR of the signs in bits[15:0] (for the quotient). + push { rT, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + lsrs rT, r3, #16 + eors rT, r2 + + // Handle division as unsigned. + bl SYM(__uidivmod_nonzero) __PLT__ + + // Set the sign of the remainder. + asrs r2, rT, #31 + eors r1, r2 + subs r1, r2 + + // Set the sign of the quotient. 
+ sxth r3, rT + eors r0, r3 + subs r0, r3 + + LLSYM(__idivmod_return): + pop { rT, pc } + .cfi_restore_state + + #if defined(PEDANTIC_DIV0) && PEDANTIC_DIV0 + LLSYM(__idivmod_zero): + // Set up the *div0() parameter specified in the ARM runtime ABI: + // * 0 if the numerator is 0, + // * Or, the largest value of the type manipulated by the calling + // division function if the numerator is positive, + // * Or, the least value of the type manipulated by the calling + // division function if the numerator is negative. + subs r1, r0 + orrs r0, r1 + asrs r0, #31 + lsrs r0, #1 + eors r0, r2 + + // At least the __aeabi_idiv0() call is common. + b SYM(__uidivmod_zero2) + #endif /* PEDANTIC_DIV0 */ + + CFI_END_FUNCTION +FUNC_END divsi3 + +#ifndef __symbian__ + FUNC_END aeabi_idiv + FUNC_END aeabi_idivmod +#endif + +#endif /* L_divsi3 */ + + +#ifdef L_udivsi3 + +// int __aeabi_uidiv(unsigned int, unsigned int) +// idiv_return __aeabi_uidivmod(unsigned int, unsigned int) +// Returns unsigned $r0 after division by $r1. +// Also returns the remainder in $r1. +FUNC_START_SECTION udivsi3 .text.sorted.libgcc.idiv.udivsi3 + +#ifndef __symbian__ + FUNC_ALIAS aeabi_uidiv udivsi3 + FUNC_ALIAS aeabi_uidivmod udivsi3 +#endif + + CFI_START_FUNCTION + + // Abort on division by zero. + tst r1, r1 + #if defined(PEDANTIC_DIV0) && PEDANTIC_DIV0 + beq LLSYM(__uidivmod_zero) + #else + beq SYM(__uidivmod_zero) + #endif + + #if defined(OPTIMIZE_SPEED) && OPTIMIZE_SPEED + // MAYBE: Optimize division by a power of 2 + #endif + + // Public symbol for the sake of divsi3(). + FUNC_ENTRY uidivmod_nonzero + // Pre division: Shift the denominator as far as possible left + // without making it larger than the numerator. + // The loop is destructive, save a copy of the numerator. + mov ip, r0 + + // Set up binary search. + movs r3, #16 + movs r2, #1 + + LLSYM(__uidivmod_align): + // Prefer dividing the numerator to multiplying the denominator + // (multiplying the denominator may result in overflow).
+ lsrs r0, r3 + cmp r0, r1 + blo LLSYM(__uidivmod_skip) + + // Multiply the denominator and the result together. + lsls r1, r3 + lsls r2, r3 + + LLSYM(__uidivmod_skip): + // Restore the numerator, and iterate until search goes to 0. + mov r0, ip + lsrs r3, #1 + bne LLSYM(__uidivmod_align) + + // The result $r3 has been conveniently initialized to 0. + b LLSYM(__uidivmod_entry) + + LLSYM(__uidivmod_loop): + // Scale the denominator and the quotient together. + lsrs r1, #1 + lsrs r2, #1 + beq LLSYM(__uidivmod_return) + + LLSYM(__uidivmod_entry): + // Test if the denominator is smaller than the numerator. + cmp r0, r1 + blo LLSYM(__uidivmod_loop) + + // If the denominator is smaller, the next bit of the result is '1'. + // If the new remainder goes to 0, exit early. + adds r3, r2 + subs r0, r1 + bne LLSYM(__uidivmod_loop) + + LLSYM(__uidivmod_return): + mov r1, r0 + mov r0, r3 + RET + + #if defined(PEDANTIC_DIV0) && PEDANTIC_DIV0 + LLSYM(__uidivmod_zero): + // Set up the *div0() parameter specified in the ARM runtime ABI: + // * 0 if the numerator is 0, + // * Or, the largest value of the type manipulated by the calling + // division function if the numerator is positive. + subs r1, r0 + orrs r0, r1 + asrs r0, #31 + + FUNC_ENTRY uidivmod_zero2 + #if defined(DOUBLE_ALIGN_STACK) && DOUBLE_ALIGN_STACK + push { rT, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + #else + push { lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 4 + .cfi_rel_offset lr, 0 + #endif + + // Since GCC implements __aeabi_idiv0() as a weak overridable function, + // this call must be prepared for a jump beyond +/- 2 KB. + // NOTE: __aeabi_idiv0() can't be implemented as a tail call, since any + // non-trivial override will (likely) corrupt a remainder in $r1. + bl SYM(__aeabi_idiv0) __PLT__ + + // Since the input to __aeabi_idiv0() was INF, there really isn't any + // choice in which of the recommended *divmod() patterns to follow.
+ // Clear the remainder to complete {INF, 0}. + eors r1, r1 + + #if defined(DOUBLE_ALIGN_STACK) && DOUBLE_ALIGN_STACK + pop { rT, pc } + .cfi_restore_state + #else + pop { pc } + .cfi_restore_state + #endif + + #else /* !PEDANTIC_DIV0 */ + FUNC_ENTRY uidivmod_zero + // NOTE: The following code sets up a return pair of {0, numerator}, + // the second preference given by the ARM runtime ABI specification. + // The pedantic version is 18 bytes larger between __aeabi_idiv() and + // __aeabi_uidiv(). However, this version does not conform to the + // out-of-line parameter requirements given for __aeabi_idiv0(), and + // also does not pass 'gcc/testsuite/gcc.target/arm/divzero.c'. + + // Since the numerator may be overwritten by __aeabi_idiv0(), save now. + // Afterwards, it can be restored directly as the remainder. + push { r0, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset r0, 0 + .cfi_rel_offset lr, 4 + + // Set up the quotient (not ABI compliant). + eors r0, r0 + + // Since GCC implements div0() as a weak overridable function, + // this call must be prepared for a jump beyond +/- 2 KB. + bl SYM(__aeabi_idiv0) __PLT__ + + // Restore the remainder and return. + pop { r1, pc } + .cfi_restore_state + + #endif /* !PEDANTIC_DIV0 */ + + CFI_END_FUNCTION +FUNC_END udivsi3 + +#ifndef __symbian__ + FUNC_END aeabi_uidiv + FUNC_END aeabi_uidivmod +#endif + +#endif /* L_udivsi3 */ + diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index f41354f811e..5a7811808a9 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -1158,6 +1158,10 @@ LSYM(Ldivbyzero_negative): /* ------------------------------------------------------------------------ */ /* Start of the Real Functions */ /* ------------------------------------------------------------------------ */ + +/* Disable these on v6m in favor of 'eabi/idiv.S', below. 
*/ +#ifndef NOT_ISA_TARGET_32BIT + #ifdef L_udivsi3 #if defined(__prefer_thumb__) @@ -1563,6 +1567,18 @@ LSYM(Lover12): DIV_FUNC_END modsi3 signed #endif /* L_modsi3 */ + +#else /* NOT_ISA_TARGET_32BIT */ +/* Temp registers. */ +#define rP r4 +#define rQ r5 +#define rS r6 +#define rT r7 + +#define PEDANTIC_DIV0 (1) +#include "eabi/idiv.S" +#endif /* NOT_ISA_TARGET_32BIT */ + /* ------------------------------------------------------------------------ */ #ifdef L_dvmd_tls @@ -1578,7 +1594,8 @@ LSYM(Lover12): FUNC_END div0 #endif -#endif /* L_divmodsi_tools */ +#endif /* L_div_tls */ + /* ------------------------------------------------------------------------ */ #ifdef L_dvmd_lnx @ GNU/Linux division-by zero handler. Used in place of L_dvmd_tls From patchwork Fri Jan 15 11:30:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426915 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=1oCrVd2D; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=mAcZqgpT; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) 
server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJvR4wlWz9sSC for ; Fri, 15 Jan 2021 22:32:51 +1100 (AEDT)
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v5 20/33] Refactor Thumb-1 64-bit division into a new file
Date: Fri, 15 Jan 2021 03:30:48 -0800
Message-Id:
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_ASCII_DIVIDERS, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SCC_10_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES,
SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi-v6m.S (__aeabi_ldivmod/ldivmod): Moved to ... * config/arm/eabi/ldiv.S: New file. * config/arm/lib1funcs.S: #include eabi/ldiv.S (v6m only). --- libgcc/config/arm/bpabi-v6m.S | 81 ------------------------- libgcc/config/arm/eabi/ldiv.S | 107 ++++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 1 + 3 files changed, 108 insertions(+), 81 deletions(-) create mode 100644 libgcc/config/arm/eabi/ldiv.S diff --git a/libgcc/config/arm/bpabi-v6m.S b/libgcc/config/arm/bpabi-v6m.S index a051c1530a4..b3dc3bf8f4d 100644 --- a/libgcc/config/arm/bpabi-v6m.S +++ b/libgcc/config/arm/bpabi-v6m.S @@ -34,87 +34,6 @@ #endif /* __ARM_EABI__ */ -.macro test_div_by_zero signed - cmp yyh, #0 - bne 7f - cmp yyl, #0 - bne 7f - cmp xxh, #0 - .ifc \signed, unsigned - bne 2f - cmp xxl, #0 -2: - beq 3f - movs xxh, #0 - mvns xxh, xxh @ 0xffffffff - movs xxl, xxh -3: - .else - blt 6f - bgt 4f - cmp xxl, #0 - beq 5f -4: movs xxl, #0 - mvns xxl, xxl @ 0xffffffff - lsrs xxh, xxl, #1 @ 0x7fffffff - b 5f -6: movs xxh, #0x80 - lsls xxh, xxh, #24 @ 0x80000000 - movs xxl, #0 -5: - .endif - @ tailcalls are tricky on v6-m. - push {r0, r1, r2} - ldr r0, 1f - adr r1, 1f - adds r0, r1 - str r0, [sp, #8] - @ We know we are not on armv4t, so pop pc is safe. 
- pop {r0, r1, pc} - .align 2 -1: - .word __aeabi_ldiv0 - 1b -7: -.endm - -#ifdef L_aeabi_ldivmod - -FUNC_START aeabi_ldivmod - test_div_by_zero signed - - push {r0, r1} - mov r0, sp - push {r0, lr} - ldr r0, [sp, #8] - bl SYM(__gnu_ldivmod_helper) - ldr r3, [sp, #4] - mov lr, r3 - add sp, sp, #8 - pop {r2, r3} - RET - FUNC_END aeabi_ldivmod - -#endif /* L_aeabi_ldivmod */ - -#ifdef L_aeabi_uldivmod - -FUNC_START aeabi_uldivmod - test_div_by_zero unsigned - - push {r0, r1} - mov r0, sp - push {r0, lr} - ldr r0, [sp, #8] - bl SYM(__udivmoddi4) - ldr r3, [sp, #4] - mov lr, r3 - add sp, sp, #8 - pop {r2, r3} - RET - FUNC_END aeabi_uldivmod - -#endif /* L_aeabi_uldivmod */ - #ifdef L_arm_addsubsf3 FUNC_START aeabi_frsub diff --git a/libgcc/config/arm/eabi/ldiv.S b/libgcc/config/arm/eabi/ldiv.S new file mode 100644 index 00000000000..3c8280ef580 --- /dev/null +++ b/libgcc/config/arm/eabi/ldiv.S @@ -0,0 +1,107 @@ +/* Miscellaneous BPABI functions. Thumb-1 implementation, suitable for ARMv4T, + ARMv6-M and ARMv8-M Baseline like ISA variants. + + Copyright (C) 2006-2020 Free Software Foundation, Inc. + Contributed by CodeSourcery. + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. 
+ + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + + +.macro test_div_by_zero signed + cmp yyh, #0 + bne 7f + cmp yyl, #0 + bne 7f + cmp xxh, #0 + .ifc \signed, unsigned + bne 2f + cmp xxl, #0 +2: + beq 3f + movs xxh, #0 + mvns xxh, xxh @ 0xffffffff + movs xxl, xxh +3: + .else + blt 6f + bgt 4f + cmp xxl, #0 + beq 5f +4: movs xxl, #0 + mvns xxl, xxl @ 0xffffffff + lsrs xxh, xxl, #1 @ 0x7fffffff + b 5f +6: movs xxh, #0x80 + lsls xxh, xxh, #24 @ 0x80000000 + movs xxl, #0 +5: + .endif + @ tailcalls are tricky on v6-m. + push {r0, r1, r2} + ldr r0, 1f + adr r1, 1f + adds r0, r1 + str r0, [sp, #8] + @ We know we are not on armv4t, so pop pc is safe. + pop {r0, r1, pc} + .align 2 +1: + .word __aeabi_ldiv0 - 1b +7: +.endm + +#ifdef L_aeabi_ldivmod + +FUNC_START aeabi_ldivmod + test_div_by_zero signed + + push {r0, r1} + mov r0, sp + push {r0, lr} + ldr r0, [sp, #8] + bl SYM(__gnu_ldivmod_helper) + ldr r3, [sp, #4] + mov lr, r3 + add sp, sp, #8 + pop {r2, r3} + RET + FUNC_END aeabi_ldivmod + +#endif /* L_aeabi_ldivmod */ + +#ifdef L_aeabi_uldivmod + +FUNC_START aeabi_uldivmod + test_div_by_zero unsigned + + push {r0, r1} + mov r0, sp + push {r0, lr} + ldr r0, [sp, #8] + bl SYM(__udivmoddi4) + ldr r3, [sp, #4] + mov lr, r3 + add sp, sp, #8 + pop {r2, r3} + RET + FUNC_END aeabi_uldivmod + +#endif /* L_aeabi_uldivmod */ + diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index 5a7811808a9..97dd9f12e31 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -1577,6 +1577,7 @@ LSYM(Lover12): #define PEDANTIC_DIV0 (1) #include "eabi/idiv.S" +#include "eabi/ldiv.S" #endif /* NOT_ISA_TARGET_32BIT */ /* ------------------------------------------------------------------------ */ From patchwork Fri Jan 15 11:30:49 2021 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 1426916
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v5 21/33] Import 64-bit division from the CM0 library
Date: Fri, 15 Jan 2021 03:30:49 -0800
Message-Id:
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
Cc: Richard.Earnshaw@foss.arm.com

gcc/libgcc/ChangeLog:
2021-01-13 Daniel Engel

	* config/arm/bpabi.c: Deleted unused file.
	* config/arm/eabi/ldiv.S (__aeabi_ldivmod, __aeabi_uldivmod):
	Replaced wrapper functions with a complete implementation.
	* config/arm/t-bpabi (LIB2ADD_ST): Removed bpabi.c.
	* config/arm/t-elf (LIB1ASMFUNCS): Added _divdi3 and _udivdi3.
--- libgcc/config/arm/bpabi.c | 42 --- libgcc/config/arm/eabi/ldiv.S | 542 +++++++++++++++++++++++++++++----- libgcc/config/arm/t-bpabi | 3 +- libgcc/config/arm/t-elf | 9 + 4 files changed, 474 insertions(+), 122 deletions(-) delete mode 100644 libgcc/config/arm/bpabi.c diff --git a/libgcc/config/arm/bpabi.c b/libgcc/config/arm/bpabi.c deleted file mode 100644 index bf6ba757964..00000000000 --- a/libgcc/config/arm/bpabi.c +++ /dev/null @@ -1,42 +0,0 @@ -/* Miscellaneous BPABI functions. - - Copyright (C) 2003-2021 Free Software Foundation, Inc. - Contributed by CodeSourcery, LLC. - - This file is free software; you can redistribute it and/or modify it - under the terms of the GNU General Public License as published by the - Free Software Foundation; either version 3, or (at your option) any - later version. - - This file is distributed in the hope that it will be useful, but - WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - . 
*/ - -extern long long __divdi3 (long long, long long); -extern unsigned long long __udivdi3 (unsigned long long, - unsigned long long); -extern long long __gnu_ldivmod_helper (long long, long long, long long *); - - -long long -__gnu_ldivmod_helper (long long a, - long long b, - long long *remainder) -{ - long long quotient; - - quotient = __divdi3 (a, b); - *remainder = a - b * quotient; - return quotient; -} - diff --git a/libgcc/config/arm/eabi/ldiv.S b/libgcc/config/arm/eabi/ldiv.S index 3c8280ef580..c225e5973b2 100644 --- a/libgcc/config/arm/eabi/ldiv.S +++ b/libgcc/config/arm/eabi/ldiv.S @@ -1,8 +1,7 @@ -/* Miscellaneous BPABI functions. Thumb-1 implementation, suitable for ARMv4T, - ARMv6-M and ARMv8-M Baseline like ISA variants. +/* ldiv.S: Thumb-1 optimized 64-bit integer division - Copyright (C) 2006-2020 Free Software Foundation, Inc. - Contributed by CodeSourcery. + Copyright (C) 2018-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) This file is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the @@ -24,84 +23,471 @@ . */ -.macro test_div_by_zero signed - cmp yyh, #0 - bne 7f - cmp yyl, #0 - bne 7f - cmp xxh, #0 - .ifc \signed, unsigned - bne 2f - cmp xxl, #0 -2: - beq 3f - movs xxh, #0 - mvns xxh, xxh @ 0xffffffff - movs xxl, xxh -3: - .else - blt 6f - bgt 4f - cmp xxl, #0 - beq 5f -4: movs xxl, #0 - mvns xxl, xxl @ 0xffffffff - lsrs xxh, xxl, #1 @ 0x7fffffff - b 5f -6: movs xxh, #0x80 - lsls xxh, xxh, #24 @ 0x80000000 - movs xxl, #0 -5: - .endif - @ tailcalls are tricky on v6-m. - push {r0, r1, r2} - ldr r0, 1f - adr r1, 1f - adds r0, r1 - str r0, [sp, #8] - @ We know we are not on armv4t, so pop pc is safe. 
- pop {r0, r1, pc} - .align 2 -1: - .word __aeabi_ldiv0 - 1b -7: -.endm - -#ifdef L_aeabi_ldivmod - -FUNC_START aeabi_ldivmod - test_div_by_zero signed - - push {r0, r1} - mov r0, sp - push {r0, lr} - ldr r0, [sp, #8] - bl SYM(__gnu_ldivmod_helper) - ldr r3, [sp, #4] - mov lr, r3 - add sp, sp, #8 - pop {r2, r3} +#ifndef __GNUC__ + +// long long __aeabi_ldiv0(long long) +// Helper function for division by 0. +WEAK_START_SECTION aeabi_ldiv0 .text.sorted.libgcc.ldiv.ldiv0 + CFI_START_FUNCTION + + #if defined(TRAP_EXCEPTIONS) && TRAP_EXCEPTIONS + svc #(SVC_DIVISION_BY_ZERO) + #endif + RET - FUNC_END aeabi_ldivmod -#endif /* L_aeabi_ldivmod */ + CFI_END_FUNCTION +FUNC_END aeabi_ldiv0 -#ifdef L_aeabi_uldivmod +#endif /* !__GNUC__ */ -FUNC_START aeabi_uldivmod - test_div_by_zero unsigned - push {r0, r1} - mov r0, sp - push {r0, lr} - ldr r0, [sp, #8] - bl SYM(__udivmoddi4) - ldr r3, [sp, #4] - mov lr, r3 - add sp, sp, #8 - pop {r2, r3} - RET - FUNC_END aeabi_uldivmod +#ifdef L_divdi3 + +// long long __aeabi_ldiv(long long, long long) +// lldiv_return __aeabi_ldivmod(long long, long long) +// Returns signed $r1:$r0 after division by $r3:$r2. +// Also returns the remainder in $r3:$r2. +// Same parent section as __divsi3() to keep branches within range. +FUNC_START_SECTION divdi3 .text.sorted.libgcc.ldiv.divdi3 + +#ifndef __symbian__ + FUNC_ALIAS aeabi_ldiv divdi3 + FUNC_ALIAS aeabi_ldivmod divdi3 +#endif + + CFI_START_FUNCTION + + // Test the denominator for zero before pushing registers. 
+ cmp yyl, #0 + bne LLSYM(__ldivmod_valid) + + cmp yyh, #0 + #if defined(PEDANTIC_DIV0) && PEDANTIC_DIV0 + beq LLSYM(__ldivmod_zero) + #else + beq SYM(__uldivmod_zero) + #endif + + LLSYM(__ldivmod_valid): + #if defined(DOUBLE_ALIGN_STACK) && DOUBLE_ALIGN_STACK + push { rP, rQ, rT, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 16 + .cfi_rel_offset rP, 0 + .cfi_rel_offset rQ, 4 + .cfi_rel_offset rT, 8 + .cfi_rel_offset lr, 12 + #else + push { rP, rQ, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 12 + .cfi_rel_offset rP, 0 + .cfi_rel_offset rQ, 4 + .cfi_rel_offset lr, 8 + #endif + + // Absolute value of the numerator. + asrs rP, xxh, #31 + eors xxl, rP + eors xxh, rP + subs xxl, rP + sbcs xxh, rP + + // Absolute value of the denominator. + asrs rQ, yyh, #31 + eors yyl, rQ + eors yyh, rQ + subs yyl, rQ + sbcs yyh, rQ + + // Keep the XOR of signs for the quotient. + eors rQ, rP + + // Handle division as unsigned. + bl SYM(__uldivmod_nonzero) __PLT__ + + // Set the sign of the quotient. + eors xxl, rQ + eors xxh, rQ + subs xxl, rQ + sbcs xxh, rQ + + // Set the sign of the remainder. + eors yyl, rP + eors yyh, rP + subs yyl, rP + sbcs yyh, rP + + LLSYM(__ldivmod_return): + #if defined(DOUBLE_ALIGN_STACK) && DOUBLE_ALIGN_STACK + pop { rP, rQ, rT, pc } + .cfi_restore_state + #else + pop { rP, rQ, pc } + .cfi_restore_state + #endif + + #if defined(PEDANTIC_DIV0) && PEDANTIC_DIV0 + LLSYM(__ldivmod_zero): + // Save the sign of the numerator. + asrs yyl, xxh, #31 + + // Set up the *div0() parameter specified in the ARM runtime ABI: + // * 0 if the numerator is 0, + // * Or, the largest value of the type manipulated by the calling + // division function if the numerator is positive, + // * Or, the least value of the type manipulated by the calling + // division function if the numerator is negative. 
+ rsbs xxl, #0 + sbcs yyh, xxh + orrs xxh, yyh + asrs xxl, xxh, #31 + lsrs xxh, xxl, #1 + eors xxh, yyl + eors xxl, yyl + + // At least the __aeabi_ldiv0() call is common. + b SYM(__uldivmod_zero2) + #endif /* PEDANTIC_DIV0 */ + + CFI_END_FUNCTION +FUNC_END divdi3 + +#ifndef __symbian__ + FUNC_END aeabi_ldiv + FUNC_END aeabi_ldivmod +#endif + +#endif /* L_divdi3 */ + + +#ifdef L_udivdi3 + +// unsigned long long __aeabi_uldiv(unsigned long long, unsigned long long) +// ulldiv_return __aeabi_uldivmod(unsigned long long, unsigned long long) +// Returns unsigned $r1:$r0 after division by $r3:$r2. +// Also returns the remainder in $r3:$r2. +FUNC_START_SECTION udivdi3 .text.sorted.libgcc.ldiv.udivdi3 + +#ifndef __symbian__ + FUNC_ALIAS aeabi_uldiv udivdi3 + FUNC_ALIAS aeabi_uldivmod udivdi3 +#endif + + CFI_START_FUNCTION + + // Test the denominator for zero before changing the stack. + cmp yyh, #0 + bne SYM(__uldivmod_nonzero) + + cmp yyl, #0 + #if defined(PEDANTIC_DIV0) && PEDANTIC_DIV0 + beq LLSYM(__uldivmod_zero) + #else + beq SYM(__uldivmod_zero) + #endif + + #if defined(OPTIMIZE_SPEED) && OPTIMIZE_SPEED + // MAYBE: Optimize division by a power of 2 + #endif + + FUNC_ENTRY uldivmod_nonzero + push { rP, rQ, rT, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 16 + .cfi_rel_offset rP, 0 + .cfi_rel_offset rQ, 4 + .cfi_rel_offset rT, 8 + .cfi_rel_offset lr, 12 + + // Set up denominator shift, assuming a single width result. + movs rP, #32 + + // If the upper word of the denominator is 0 ... + tst yyh, yyh + bne LLSYM(__uldivmod_setup) + + #if !defined(__OPTIMIZE_SIZE__) || !__OPTIMIZE_SIZE__ + // ... and the upper word of the numerator is also 0, + // single width division will be at least twice as fast. + tst xxh, xxh + beq LLSYM(__uldivmod_small) + #endif + + // ... and the lower word of the denominator is less than or equal + // to the upper word of the numerator ... + cmp xxh, yyl + blo LLSYM(__uldivmod_setup) + + // ... 
then the result will be double width, at least 33 bits. + // Set up a flag in $rP to seed the shift for the second word. + movs yyh, yyl + eors yyl, yyl + adds rP, #64 + + LLSYM(__uldivmod_setup): + // Pre division: Shift the denominator as far as possible left + // without making it larger than the numerator. + // Since search is destructive, first save a copy of the numerator. + mov ip, xxl + mov lr, xxh + + // Set up binary search. + movs rQ, #16 + eors rT, rT + + LLSYM(__uldivmod_align): + // Maintain a secondary shift $rT = 32 - $rQ, making the overlapping + // shifts between low and high words easier to construct. + adds rT, rQ + + // Prefer dividing the numerator to multiplying the denominator + // (multiplying the denominator may result in overflow). + lsrs xxh, rQ + + // Measure the high bits of denominator against the numerator. + cmp xxh, yyh + blo LLSYM(__uldivmod_skip) + bhi LLSYM(__uldivmod_shift) + + // If the high bits are equal, construct the low bits for checking. + mov xxh, lr + lsls xxh, rT + + lsrs xxl, rQ + orrs xxh, xxl + + cmp xxh, yyl + blo LLSYM(__uldivmod_skip) + + LLSYM(__uldivmod_shift): + // Scale the denominator and the result together. + subs rP, rQ + + // If the reduced numerator is still larger than or equal to the + // denominator, it is safe to shift the denominator left. + movs xxh, yyl + lsrs xxh, rT + lsls yyh, rQ + + lsls yyl, rQ + orrs yyh, xxh + + LLSYM(__uldivmod_skip): + // Restore the numerator. + mov xxl, ip + mov xxh, lr + + // Iterate until the shift goes to 0. + lsrs rQ, #1 + bne LLSYM(__uldivmod_align) + + // Initialize the result (zero). + mov ip, rQ + + // HACK: Compensate for the first word test. + lsls rP, #6 + + LLSYM(__uldivmod_word2): + // Is there another word? + lsrs rP, #6 + beq LLSYM(__uldivmod_return) + + // Shift the calculated result by 1 word.
+ mov lr, ip + mov ip, rQ + + // Set up the MSB of the next word of the quotient + movs rQ, #1 + rors rQ, rP + b LLSYM(__uldivmod_entry) + + LLSYM(__uldivmod_loop): + // Divide the denominator by 2. + // It could be slightly faster to multiply the numerator, + // but that would require shifting the remainder at the end. + lsls rT, yyh, #31 + lsrs yyh, #1 + lsrs yyl, #1 + adds yyl, rT + + // Step to the next bit of the result. + lsrs rQ, #1 + beq LLSYM(__uldivmod_word2) + + LLSYM(__uldivmod_entry): + // Test if the denominator is smaller, high word first. + cmp xxh, yyh + blo LLSYM(__uldivmod_loop) + bhi LLSYM(__uldivmod_quotient) + + cmp xxl, yyl + blo LLSYM(__uldivmod_loop) + + LLSYM(__uldivmod_quotient): + // Smaller denominator: the next bit of the quotient will be set. + add ip, rQ + + // Subtract the denominator from the remainder. + // If the new remainder goes to 0, exit early. + subs xxl, yyl + sbcs xxh, yyh + bne LLSYM(__uldivmod_loop) + + tst xxl, xxl + bne LLSYM(__uldivmod_loop) + + #if !defined(__OPTIMIZE_SIZE__) || !__OPTIMIZE_SIZE__ + // Check whether there's still a second word to calculate. + lsrs rP, #6 + beq LLSYM(__uldivmod_return) + + // If so, shift the result left by a full word. + mov lr, ip + mov ip, xxh // zero + #else + eors rQ, rQ + b LLSYM(__uldivmod_word2) + #endif + + LLSYM(__uldivmod_return): + // Move the remainder to the second half of the result. + movs yyl, xxl + movs yyh, xxh + + // Move the quotient to the first half of the result. + mov xxl, ip + mov xxh, lr + + pop { rP, rQ, rT, pc } + .cfi_restore_state + + #if defined(PEDANTIC_DIV0) && PEDANTIC_DIV0 + LLSYM(__uldivmod_zero): + // Set up the *div0() parameter specified in the ARM runtime ABI: + // * 0 if the numerator is 0, + // * Or, the largest value of the type manipulated by the calling + // division function if the numerator is positive.
+ subs yyl, xxl + sbcs yyh, xxh + orrs xxh, yyh + asrs xxh, #31 + movs xxl, xxh + + FUNC_ENTRY uldivmod_zero2 + #if defined(DOUBLE_ALIGN_STACK) && DOUBLE_ALIGN_STACK + push { rT, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + #else + push { lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 4 + .cfi_rel_offset lr, 0 + #endif + + // Since GCC implements __aeabi_ldiv0() as a weak overridable function, + // this call must be prepared for a jump beyond +/- 2 KB. + // NOTE: __aeabi_ldiv0() can't be implemented as a tail call, since any + // non-trivial override will (likely) corrupt a remainder in $r3:$r2. + bl SYM(__aeabi_ldiv0) __PLT__ + + // Since the input to __aeabi_ldiv0() was INF, there really isn't any + // choice in which of the recommended *divmod() patterns to follow. + // Clear the remainder to complete {INF, 0}. + eors yyl, yyl + eors yyh, yyh + + #if defined(DOUBLE_ALIGN_STACK) && DOUBLE_ALIGN_STACK + pop { rT, pc } + .cfi_restore_state + #else + pop { pc } + .cfi_restore_state + #endif + + #else /* !PEDANTIC_DIV0 */ + FUNC_ENTRY uldivmod_zero + // NOTE: The following code sets up a return pair of {0, numerator}, + // the second preference given by the ARM runtime ABI specification. + // The pedantic version is 30 bytes larger between __aeabi_ldiv() and + // __aeabi_uldiv(). However, this version does not conform to the + // out-of-line parameter requirements given for __aeabi_ldiv0(), and + // also does not pass 'gcc/testsuite/gcc.target/arm/divzero.c'. + + // Since the numerator may be overwritten by __aeabi_ldiv0(), save now. + // Afterwards, they can be restored directly as the remainder. 
+ #if defined(DOUBLE_ALIGN_STACK) && DOUBLE_ALIGN_STACK + push { r0, r1, rT, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 16 + .cfi_rel_offset xxl,0 + .cfi_rel_offset xxh,4 + .cfi_rel_offset rT, 8 + .cfi_rel_offset lr, 12 + #else + push { r0, r1, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 12 + .cfi_rel_offset xxl,0 + .cfi_rel_offset xxh,4 + .cfi_rel_offset lr, 8 + #endif + + // Set up the quotient. + eors xxl, xxl + eors xxh, xxh + + // Since GCC implements div0() as a weak overridable function, + // this call must be prepared for a jump beyond +/- 2 KB. + bl SYM(__aeabi_ldiv0) __PLT__ + + // Restore the remainder and return. + #if defined(DOUBLE_ALIGN_STACK) && DOUBLE_ALIGN_STACK + pop { r2, r3, rT, pc } + .cfi_restore_state + #else + pop { r2, r3, pc } + .cfi_restore_state + #endif + #endif /* !PEDANTIC_DIV0 */ + + #if !defined(__OPTIMIZE_SIZE__) || !__OPTIMIZE_SIZE__ + LLSYM(__uldivmod_small): + // Arrange operands for (much faster) 32-bit division. + #if defined(__ARMEB__) && __ARMEB__ + movs r0, r1 + movs r1, r3 + #else + movs r1, r2 + #endif + + bl SYM(__uidivmod_nonzero) __PLT__ + + // Arrange results back into 64-bit format. + #if defined(__ARMEB__) && __ARMEB__ + movs r3, r1 + movs r1, r0 + #else + movs r2, r1 + #endif + + // Extend quotient and remainder to 64 bits, unsigned. + eors xxh, xxh + eors yyh, yyh + pop { rP, rQ, rT, pc } + #endif + + CFI_END_FUNCTION +FUNC_END udivdi3 + +#ifndef __symbian__ + FUNC_END aeabi_uldiv + FUNC_END aeabi_uldivmod +#endif -#endif /* L_aeabi_uldivmod */ +#endif /* udivdi3 */ diff --git a/libgcc/config/arm/t-bpabi b/libgcc/config/arm/t-bpabi index dddddc7c444..86234d5676f 100644 --- a/libgcc/config/arm/t-bpabi +++ b/libgcc/config/arm/t-bpabi @@ -2,8 +2,7 @@ LIB1ASMFUNCS += _aeabi_lcmp _aeabi_ulcmp _aeabi_ldivmod _aeabi_uldivmod # Add the BPABI C functions. 
-LIB2ADD += $(srcdir)/config/arm/bpabi.c \ - $(srcdir)/config/arm/unaligned-funcs.c +LIB2ADD += $(srcdir)/config/arm/unaligned-funcs.c LIB2ADD_ST += $(srcdir)/config/arm/fp16.c diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index 83325410097..4d430325fa1 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -50,6 +50,15 @@ LIB1ASMFUNCS += \ _umodsi3 \ +ifeq (__ARM_ARCH_ISA_THUMB 1,$(ARM_ISA)$(THUMB1_ISA)) +# Group 1B: Integer functions built for v6m only. +LIB1ASMFUNCS += \ + _divdi3 \ + _udivdi3 \ + +endif + + # Group 2: Single precision floating point function objects. LIB1ASMFUNCS += \ _arm_addsubsf3 \ From patchwork Fri Jan 15 11:30:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426917 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=LisHWYGt; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=EKU8aZz4; dkim-atps=neutral Received: from sourceware.org (unknown [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJvc2x5Qz9sWc for ; Fri, 15 Jan 2021 22:33:00 +1100 (AEDT) Received: 
from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 62A943982429; Fri, 15 Jan 2021 11:31:55 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id 677A9398241F for ; Fri, 15 Jan 2021 11:31:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 677A9398241F Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 7D767F7D; Fri, 15 Jan 2021 06:31:50 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:31:50 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=6h6WVA/dfQvDM Gtz60FxM3SkGr3/DeqGVkvDtte53Ok=; b=LisHWYGtcSrV1jsjrr0fmBNf1Fw9p sQGFDgeSi3hsIRAt6BxdTlyPePQfnRAliomKBOB7XCsXXISlCJDtaZ9KbA3bpqZv YT3JtPo6JF3gJvjYCk27k+vUdlrNhy2wPCGiyxlaK/OEajarjJFYjCX64v1ASDMy h9AbPjhAKOhqieICckXPi8RIjm3vBp0UxVNalo6TdZhLjBEdERqmg5sEf7FCkiNy k8igj8wHf77biSNiK4RdO/Eq9+LB1zoTVgGqOOyOToCjX4n2X97sDTXI/BcnVHnQ idrV8QW/mfppSSgYgj6ODul/k7K6GN+Oz5LSGNnrxk5WfxN5dAZcsPrkg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=6h6WVA/dfQvDMGtz60FxM3SkGr3/DeqGVkvDtte53Ok=; b=EKU8aZz4 EKWCmgpPgz5T9I8PmBeeiRVF7VIIsL/Qe4vivB/+l+jeWO1qqZpUlBs4M69uveIl vAmvmqZhJSCmb5es16ngehEE3ZQMx1l4p1yJ52KKuo097AY+DUp2EWboE2iJJT+Y 
QpCGcQOvoqlXUqDaiHcYUsUz1aphR58IoN+fDPLO6xUIcVajrebvINk6NiTdSW6a vHOFQSrWL4HavS1CGp1JBFrCEE8IKNuF6gJGgP3zKG+QNchltxe4WECCUqP1Nd/9 Wr4SzUXKsQ0ethlyhzdXpit5W0BB1Vx4D+k0El7mXSBwn1oM1f3DTxgHojzP+4gn 6fu4P3V/6q81kQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddtgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeejffeileefgfffvdevheelgeduue euiefhueetteeuudeikefhfeeugffhueegheenucffohhmrghinheplhhmuhhlrdhssgdp ghhnuhdrohhrghdplhhisgdufhhunhgtshdrshgsnecukfhppeejuddrfeeirddutddtrd dvvddtnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhep ghhnuhesuggrnhhivghlvghnghgvlhdrtghomh X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id B60C61080066; Fri, 15 Jan 2021 06:31:49 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVm5n023757; Fri, 15 Jan 2021 03:31:48 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 22/33] Import integer multiplication from the CM0 library Date: Fri, 15 Jan 2021 03:30:50 -0800 Message-Id: <15701b463680f1f6d985b9aad13987dc39681b5a.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 
(2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" gcc/libgcc/ChangeLog: 2021-01-07 Daniel Engel * config/arm/eabi/lmul.S: New file for __muldi3(), __mulsidi3(), and __umulsidi3(). * config/arm/lib1funcs.S: #include eabi/lmul.S (v6m only). * config/arm/t-elf: Add the new objects to LIB1ASMFUNCS. --- libgcc/config/arm/eabi/lmul.S | 218 ++++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 1 + libgcc/config/arm/t-elf | 13 +- 3 files changed, 230 insertions(+), 2 deletions(-) create mode 100644 libgcc/config/arm/eabi/lmul.S diff --git a/libgcc/config/arm/eabi/lmul.S b/libgcc/config/arm/eabi/lmul.S new file mode 100644 index 00000000000..9fec4364a26 --- /dev/null +++ b/libgcc/config/arm/eabi/lmul.S @@ -0,0 +1,218 @@ +/* lmul.S: Thumb-1 optimized 64-bit integer multiplication + + Copyright (C) 2018-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation.
+ + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + + +#ifdef L_muldi3 + +// long long __aeabi_lmul(long long, long long) +// Returns the least significant 64 bits of a 64 bit multiplication. +// Expects the two multiplicands in $r1:$r0 and $r3:$r2. +// Returns the product in $r1:$r0 (does not distinguish signed types). +// Uses $r4 and $r5 as scratch space. +// Same parent section as __umulsidi3() to keep tail call branch within range. +FUNC_START_SECTION muldi3 .text.sorted.libgcc.lmul.muldi3 + +#ifndef __symbian__ + FUNC_ALIAS aeabi_lmul muldi3 +#endif + + CFI_START_FUNCTION + + // $r1:$r0 = 0xDDDDCCCCBBBBAAAA + // $r3:$r2 = 0xZZZZYYYYXXXXWWWW + + // The following operations that only affect the upper 64 bits + // can be safely discarded: + // DDDD * ZZZZ + // DDDD * YYYY + // DDDD * XXXX + // CCCC * ZZZZ + // CCCC * YYYY + // BBBB * ZZZZ + + // MAYBE: Test for multiply by ZERO on implementations with a 32-cycle + // 'muls' instruction, and skip over the operation in that case. + + // (0xDDDDCCCC * 0xXXXXWWWW), free $r1 + muls xxh, yyl + + // (0xZZZZYYYY * 0xBBBBAAAA), free $r3 + muls yyh, xxl + adds yyh, xxh + + // Put the parameters in the correct form for umulsidi3(). + movs xxh, yyl + b LLSYM(__mul_overflow) + + CFI_END_FUNCTION +FUNC_END muldi3 + +#ifndef __symbian__ + FUNC_END aeabi_lmul +#endif + +#endif /* L_muldi3 */ + + +// The following implementation of __umulsidi3() integrates with __muldi3() +// above to allow the fast tail call while still preserving the extra +// hi-shifted bits of the result. However, these extra bits add a few +// instructions not otherwise required when using only __umulsidi3(). +// Therefore, this block configures __umulsidi3() for compilation twice. 
+// The first version is a minimal standalone implementation, and the second +// version adds the hi bits of __muldi3(). The standalone version must +// be declared WEAK, so that the combined version can supersede it and +// provide both symbols in programs that multiply long long values. +// This means '_umulsidi3' should appear before '_muldi3' in LIB1ASMFUNCS. +#if defined(L_muldi3) || defined(L_umulsidi3) + +#ifdef L_umulsidi3 +// unsigned long long __umulsidi3(unsigned int, unsigned int) +// Returns all 64 bits of a 32 bit multiplication. +// Expects the two multiplicands in $r0 and $r1. +// Returns the product in $r1:$r0. +// Uses $r2, $r3 and $ip as scratch space. +WEAK_START_SECTION umulsidi3 .text.sorted.libgcc.lmul.umulsidi3 + CFI_START_FUNCTION + +#else /* L_muldi3 */ +FUNC_ENTRY umulsidi3 + CFI_START_FUNCTION + + // 32x32 multiply with 64 bit result. + // Expand the multiply into 4 parts, since muls only returns 32 bits. + // (a16h * b16h / 2^32) + // + (a16h * b16l / 2^48) + (a16l * b16h / 2^48) + // + (a16l * b16l / 2^64) + + // MAYBE: Test for multiply by 0 on implementations with a 32-cycle + // 'muls' instruction, and skip over the operation in that case. + + eors yyh, yyh + + LLSYM(__mul_overflow): + mov ip, yyh + +#endif /* !L_muldi3 */ + + // a16h * b16h + lsrs r2, xxl, #16 + lsrs r3, xxh, #16 + muls r2, r3 + + #ifdef L_muldi3 + add ip, r2 + #else + mov ip, r2 + #endif + + // a16l * b16h; save a16h first! + lsrs r2, xxl, #16 + #if (__ARM_ARCH >= 6) + uxth xxl, xxl + #else /* __ARM_ARCH < 6 */ + lsls xxl, #16 + lsrs xxl, #16 + #endif + muls r3, xxl + + // a16l * b16l + #if (__ARM_ARCH >= 6) + uxth xxh, xxh + #else /* __ARM_ARCH < 6 */ + lsls xxh, #16 + lsrs xxh, #16 + #endif + muls xxl, xxh + + // a16h * b16l + muls xxh, r2 + + // Distribute intermediate results. + eors r2, r2 + adds xxh, r3 + adcs r2, r2 + lsls r3, xxh, #16 + lsrs xxh, #16 + lsls r2, #16 + adds xxl, r3 + adcs xxh, r2 + + // Add in the high bits.
+ add xxh, ip + + RET + + CFI_END_FUNCTION +FUNC_END umulsidi3 + +#endif /* L_muldi3 || L_umulsidi3 */ + + +#ifdef L_mulsidi3 + +// long long mulsidi3(int, int) +// Returns all 64 bits of a 32 bit signed multiplication. +// Expects the two multiplicands in $r0 and $r1. +// Returns the product in $r1:$r0. +// Uses $r3, $r4 and $rT as scratch space. +FUNC_START_SECTION mulsidi3 .text.sorted.libgcc.lmul.mulsidi3 + CFI_START_FUNCTION + + // Push registers for function call. + push { rT, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + // Save signs of the arguments. + asrs r3, r0, #31 + asrs rT, r1, #31 + + // Absolute value of the arguments. + eors r0, r3 + eors r1, rT + subs r0, r3 + subs r1, rT + + // Save sign of the result. + eors rT, r3 + + bl SYM(__umulsidi3) __PLT__ + + // Apply sign of the result. + eors xxl, rT + eors xxh, rT + subs xxl, rT + sbcs xxh, rT + + pop { rT, pc } + .cfi_restore_state + + CFI_END_FUNCTION +FUNC_END mulsidi3 + +#endif /* L_mulsidi3 */ + diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index 97dd9f12e31..dc34ea76b15 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -1578,6 +1578,7 @@ LSYM(Lover12): #define PEDANTIC_DIV0 (1) #include "eabi/idiv.S" #include "eabi/ldiv.S" +#include "eabi/lmul.S" #endif /* NOT_ISA_TARGET_32BIT */ /* ------------------------------------------------------------------------ */ diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index 4d430325fa1..eb1acd8d5a2 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -27,6 +27,13 @@ LIB1ASMFUNCS += \ _paritysi2 \ _popcountsi2 \ +ifeq (__ARM_ARCH_ISA_THUMB 1,$(ARM_ISA)$(THUMB1_ISA)) +# Group 0B: WEAK overridable function objects built for v6m only. +LIB1ASMFUNCS += \ + _muldi3 \ + +endif + # Group 1: Integer function objects. 
LIB1ASMFUNCS += \ @@ -51,11 +58,13 @@ LIB1ASMFUNCS += \ ifeq (__ARM_ARCH_ISA_THUMB 1,$(ARM_ISA)$(THUMB1_ISA)) -# Group 1B: Integer functions built for v6m only. +# Group 1B: Integer function objects built for v6m only. LIB1ASMFUNCS += \ _divdi3 \ _udivdi3 \ - + _mulsidi3 \ + _umulsidi3 \ + endif From patchwork Fri Jan 15 11:30:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426918 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=3yGkBG57; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=KvPzGRT5; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJvj0wJ8z9sWj for ; Fri, 15 Jan 2021 22:33:04 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E1922398242F; Fri, 15 Jan 2021 11:31:56 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com 
(wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id 91BFD3982421 for ; Fri, 15 Jan 2021 11:31:53 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 91BFD3982421 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id A26B8EBC; Fri, 15 Jan 2021 06:31:52 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:31:52 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=p92Q2ojHBlQV3 j+pLTZrV4faOepDyC1rYOIfwj/aWHA=; b=3yGkBG57EBeEVMbaanADpxAQtEqif A7UhqZP/nC+tXH+PUqDNIMUUwnaRmDUOMYpLZ7KkHlQMAW5EiJXVltX36+SBB2sE ixDtgFnVEhHa1yE5AP/T3R7DMYeYaeIrX0/lRAeirPawylypqNtBbj6p8VH6+qvY UIen3nzfVioNxCb3DwCpTwJS9fc6pEeLfXdV1rkySzdZy7fDjp/23khoOe6ElkPK NPjiREc2wj0nrzIYL/yEHEWT1rV1Efruy+bjNyIvbkOks8bvjT6gXaUH8DQJ6Ahh H4y/5PjnXV7OakDDcLGvOD+3eddJlMu/XKw1kdcTOI91Wr0QL3cDTBr2w== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=p92Q2ojHBlQV3j+pLTZrV4faOepDyC1rYOIfwj/aWHA=; b=KvPzGRT5 jO9WLVo9JPkUvKUh4GSjEpmDxLrTHI4eNzOVBEcnV9/qhQR4s/k99p/h9XL3yDUQ GYna66JkHRCyAQkV5GgUEAF/E9k9ct0++Jmr1vheHg/4RXioz4QafRx3iwKnN9Se LCIoH3fWEeDXHv6NVmOmXSu8wt+bN9v5bNe1rBrqQsvoDcRLyVa8PvlFd/ha8KOZ BoxzklDRqQgpL0TM7iA5JaJwwrXNwLPnaiP7PWymR6eFftaQY2wohLxylsnepv6w tFu2Mch8t1612ANQBtdBp7ktFou5X271JRr6Ojw7nfyzQcaWH71/6lBpXwSfxCu0 KtcTz9F+QB4xoQ== X-ME-Sender: X-ME-Proxy-Cause: 
gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddtgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeelgfekfedtkefhvddvvdevjeeutd efkeevueekhfffuedtudfgueeggfevhfelueenucffohhmrghinhepsghprggsihdqvhei mhdrshgspdhftghmphdrshgspdhgnhhurdhorhhgpdhlihgsudhfuhhntghsrdhssgenuc fkphepjedurdefiedruddttddrvddvtdenucevlhhushhtvghrufhiiigvpedtnecurfgr rhgrmhepmhgrihhlfhhrohhmpehgnhhusegurghnihgvlhgvnhhgvghlrdgtohhm X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id D0BEF108005B; Fri, 15 Jan 2021 06:31:51 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVoah023760; Fri, 15 Jan 2021 03:31:50 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 23/33] Refactor Thumb-1 float comparison into a new file Date: Fri, 15 Jan 2021 03:30:51 -0800 Message-Id: <5054b7e85b8598815aa485a03c8dd60ebbcb31cb.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: 
Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi-v6m.S (__aeabi_cfcmpeq, __aeabi_cfcmple, __aeabi_cfrcmple, __aeabi_fcmpeq, __aeabi_fcmple, aeabi_fcmple, __aeabi_fcmpgt, aeabi_fcmpge): Moved to ... * config/arm/eabi/fcmp.S: New file. * config/arm/lib1funcs.S: #include eabi/fcmp.S (v6m only). --- libgcc/config/arm/bpabi-v6m.S | 63 ------------------------- libgcc/config/arm/eabi/fcmp.S | 89 +++++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 1 + 3 files changed, 90 insertions(+), 63 deletions(-) create mode 100644 libgcc/config/arm/eabi/fcmp.S diff --git a/libgcc/config/arm/bpabi-v6m.S b/libgcc/config/arm/bpabi-v6m.S index b3dc3bf8f4d..7c874f06218 100644 --- a/libgcc/config/arm/bpabi-v6m.S +++ b/libgcc/config/arm/bpabi-v6m.S @@ -49,69 +49,6 @@ FUNC_START aeabi_frsub #endif /* L_arm_addsubsf3 */ -#ifdef L_arm_cmpsf2 - -FUNC_START aeabi_cfrcmple - - mov ip, r0 - movs r0, r1 - mov r1, ip - b 6f - -FUNC_START aeabi_cfcmpeq -FUNC_ALIAS aeabi_cfcmple aeabi_cfcmpeq - - @ The status-returning routines are required to preserve all - @ registers except ip, lr, and cpsr. -6: push {r0, r1, r2, r3, r4, lr} - bl __lesf2 - @ Set the Z flag correctly, and the C flag unconditionally. - cmp r0, #0 - @ Clear the C flag if the return value was -1, indicating - @ that the first operand was smaller than the second. 
- bmi 1f - movs r1, #0 - cmn r0, r1 -1: - pop {r0, r1, r2, r3, r4, pc} - - FUNC_END aeabi_cfcmple - FUNC_END aeabi_cfcmpeq - FUNC_END aeabi_cfrcmple - -FUNC_START aeabi_fcmpeq - - push {r4, lr} - bl __eqsf2 - negs r0, r0 - adds r0, r0, #1 - pop {r4, pc} - - FUNC_END aeabi_fcmpeq - -.macro COMPARISON cond, helper, mode=sf2 -FUNC_START aeabi_fcmp\cond - - push {r4, lr} - bl __\helper\mode - cmp r0, #0 - b\cond 1f - movs r0, #0 - pop {r4, pc} -1: - movs r0, #1 - pop {r4, pc} - - FUNC_END aeabi_fcmp\cond -.endm - -COMPARISON lt, le -COMPARISON le, le -COMPARISON gt, ge -COMPARISON ge, ge - -#endif /* L_arm_cmpsf2 */ - #ifdef L_arm_addsubdf3 FUNC_START aeabi_drsub diff --git a/libgcc/config/arm/eabi/fcmp.S b/libgcc/config/arm/eabi/fcmp.S new file mode 100644 index 00000000000..96d627f1fea --- /dev/null +++ b/libgcc/config/arm/eabi/fcmp.S @@ -0,0 +1,89 @@ +/* Miscellaneous BPABI functions. Thumb-1 implementation, suitable for ARMv4T, + ARMv6-M and ARMv8-M Baseline like ISA variants. + + Copyright (C) 2006-2020 Free Software Foundation, Inc. + Contributed by CodeSourcery. + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . 
*/ + + +#ifdef L_arm_cmpsf2 + +FUNC_START aeabi_cfrcmple + + mov ip, r0 + movs r0, r1 + mov r1, ip + b 6f + +FUNC_START aeabi_cfcmpeq +FUNC_ALIAS aeabi_cfcmple aeabi_cfcmpeq + + @ The status-returning routines are required to preserve all + @ registers except ip, lr, and cpsr. +6: push {r0, r1, r2, r3, r4, lr} + bl __lesf2 + @ Set the Z flag correctly, and the C flag unconditionally. + cmp r0, #0 + @ Clear the C flag if the return value was -1, indicating + @ that the first operand was smaller than the second. + bmi 1f + movs r1, #0 + cmn r0, r1 +1: + pop {r0, r1, r2, r3, r4, pc} + + FUNC_END aeabi_cfcmple + FUNC_END aeabi_cfcmpeq + FUNC_END aeabi_cfrcmple + +FUNC_START aeabi_fcmpeq + + push {r4, lr} + bl __eqsf2 + negs r0, r0 + adds r0, r0, #1 + pop {r4, pc} + + FUNC_END aeabi_fcmpeq + +.macro COMPARISON cond, helper, mode=sf2 +FUNC_START aeabi_fcmp\cond + + push {r4, lr} + bl __\helper\mode + cmp r0, #0 + b\cond 1f + movs r0, #0 + pop {r4, pc} +1: + movs r0, #1 + pop {r4, pc} + + FUNC_END aeabi_fcmp\cond +.endm + +COMPARISON lt, le +COMPARISON le, le +COMPARISON gt, ge +COMPARISON ge, ge + +#endif /* L_arm_cmpsf2 */ + diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index dc34ea76b15..2f18eb68cba 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -2010,6 +2010,7 @@ LSYM(Lchange_\register): #include "bpabi.S" #else /* NOT_ISA_TARGET_32BIT */ #include "bpabi-v6m.S" +#include "eabi/fcmp.S" #endif /* NOT_ISA_TARGET_32BIT */ #include "eabi/lcmp.S" #endif /* !__symbian__ */ From patchwork Fri Jan 15 11:30:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426919 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; 
helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=WNjFW1vd; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=g/j3zuil; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJvn2wW9z9sWk for ; Fri, 15 Jan 2021 22:33:09 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 59ECB3982435; Fri, 15 Jan 2021 11:32:00 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id E1E48398242D for ; Fri, 15 Jan 2021 11:31:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org E1E48398242D Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id F1016F7C; Fri, 15 Jan 2021 06:31:54 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:31:55 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; 
h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=+AAnY51xjVzg7 /r9BvB4CLgleabr4wT1vvE82BT0vp8=; b=WNjFW1vdR4Oer1Bj/CHn0RgfkaKCx nuS9yfJK3tRGRVeIizXZw//NnOp+7/AuQI5dTpHeH1hnGSUmU86cpvm06+AHWHdb veIjXWqzOKdeU4IRf3Lyc5EY6ENE5Lm7FcJ5hk8YC2pWL64GCUkRqucB5J4Z0YMf 5x5axdpA8Bh0UjbN/Wj+MG4Cwj4olzlS7cay67+C3/eOByHzzaR5QSgK6rkbqV3L cgHBO+EO5vz8AnZ6WZZ/KedPNWFCx6FH8TtG5bUg3rbjEQ1NN15xdhku5FxS2giK wK4zOF5f8J9ilO1QHknrM//Wcjk/utCbM8v5mrjA1ladm4k/YvbfSYWoQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=+AAnY51xjVzg7/r9BvB4CLgleabr4wT1vvE82BT0vp8=; b=g/j3zuil ECEqSiBtcfuFE23mXo8ojRhL+QJDB+N6zF5ZWrX6s5pW2IVLPvof3mBXWY5EVvbq XP8zfaMoaqAD1gVn4tEHVoZTQWtUBTvRkiDU9CqwYC1dh75A5sj7nV8ieqyMXI7u zgjPJcFXXCqkkm6eUbBFcROkgVs6AhIic8DhPQZWtknNUlBtOt8wcGEPnAMWeG0/ Oe+btonZroIJnQamw3ljTG9WUPokW4egB90mz/8Ao5nfbTfBc7vprBNy9Cwgs6cc dGKwjSqGna9UKM/fP+6u35WLO7bII/gW+GiI+Id/6udisqMwFkBb7MZKIS6mTzLv 4XlAsnFc7rvajQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddtgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeeuudffvddtkeffffefieeikedugf ejudduudejteeuheehudfhheejleelhfegudenucffohhmrghinhepfhgtmhhprdhssgdp ghhnuhdrohhrghdplhhisgdufhhunhgtshdrshgsnecukfhppeejuddrfeeirddutddtrd dvvddtnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhep ghhnuhesuggrnhhivghlvghnghgvlhdrtghomh X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id 03E67108005B; Fri, 15 Jan 2021 06:31:53 -0500 
(EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVqK2023763; Fri, 15 Jan 2021 03:31:52 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 24/33] Import float comparison from the CM0 library Date: Fri, 15 Jan 2021 03:30:52 -0800 Message-Id: <51670493672182338dcea5b1452840bd7fe8597b.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" These functions are significantly smaller and faster than the wrapper functions and soft-float implementation they replace. Using the first comparison operator (e.g. '<=') in any program costs about 70 bytes initially, but every additional operator incrementally adds just 4 bytes. NOTE: It seems that the __aeabi_cfcmp*() routines formerly in bpabi-v6m.S were not well tested, as they returned wrong results for the 'C' flag. The replacement functions are fully tested. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/eabi/fcmp.S (__cmpsf2, __eqsf2, __gesf2, __aeabi_fcmpne, __aeabi_fcmpun): Added new functions. 
(__aeabi_fcmpeq, __aeabi_fcmpne, __aeabi_fcmplt, __aeabi_fcmple, __aeabi_fcmpge, __aeabi_fcmpgt, __aeabi_cfcmple, __aeabi_cfcmpeq, __aeabi_cfrcmple): Replaced with branches to __internal_cmpsf2(). * config/arm/eabi/fplib.h: New file with fcmp-specific constants and general build configuration macros. * config/arm/lib1funcs.S: #include eabi/fplib.h (v6m only). * config/arm/t-elf (LIB1ASMFUNCS): Added _internal_cmpsf2, _arm_cfcmpeq, _arm_cfcmple, _arm_cfrcmple, _arm_fcmpeq, _arm_fcmpge, _arm_fcmpgt, _arm_fcmple, _arm_fcmplt, _arm_fcmpne, _arm_eqsf2, and _arm_gesf2. --- libgcc/config/arm/eabi/fcmp.S | 643 +++++++++++++++++++++++++++++---- libgcc/config/arm/eabi/fplib.h | 83 +++++ libgcc/config/arm/lib1funcs.S | 1 + libgcc/config/arm/t-elf | 18 + 4 files changed, 681 insertions(+), 64 deletions(-) create mode 100644 libgcc/config/arm/eabi/fplib.h diff --git a/libgcc/config/arm/eabi/fcmp.S b/libgcc/config/arm/eabi/fcmp.S index 96d627f1fea..cada33f4d35 100644 --- a/libgcc/config/arm/eabi/fcmp.S +++ b/libgcc/config/arm/eabi/fcmp.S @@ -1,8 +1,7 @@ -/* Miscellaneous BPABI functions. Thumb-1 implementation, suitable for ARMv4T, - ARMv6-M and ARMv8-M Baseline like ISA variants. +/* fcmp.S: Thumb-1 optimized 32-bit float comparison - Copyright (C) 2006-2020 Free Software Foundation, Inc. - Contributed by CodeSourcery. + Copyright (C) 2018-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) This file is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the @@ -24,66 +23,582 @@ . */ +// The various compare functions in this file all expect to tail call __cmpsf2() +// with flags set for a particular comparison mode. The __internal_cmpsf2() +// symbol itself is unambiguous, but there is a remote risk that the linker +// will prefer some other symbol in place of __cmpsf2(). Importing an archive +// file that also exports __cmpsf2() will throw an error in this case. 
+// As a workaround, this block configures __cmpsf2() for compilation twice. +// The first version configures __internal_cmpsf2() as a WEAK standalone symbol, +// and the second exports __cmpsf2() and __internal_cmpsf2() normally. +// A small bonus: programs not using __cmpsf2() itself will be slightly smaller. +// 'L_internal_cmpsf2' should appear before 'L_arm_cmpsf2' in LIB1ASMFUNCS. +#if defined(L_arm_cmpsf2) || defined(L_internal_cmpsf2) + +#define CMPSF2_SECTION .text.sorted.libgcc.fcmp.cmpsf2 + +// int __cmpsf2(float, float) +// +// Returns the three-way comparison result of $r0 with $r1: +// * +1 if ($r0 > $r1), or either argument is NAN +// * 0 if ($r0 == $r1) +// * -1 if ($r0 < $r1) +// Uses $r2, $r3, and $ip as scratch space. +#ifdef L_arm_cmpsf2 +FUNC_START_SECTION cmpsf2 CMPSF2_SECTION +FUNC_ALIAS lesf2 cmpsf2 +FUNC_ALIAS ltsf2 cmpsf2 + CFI_START_FUNCTION + + // Assumption: The 'libgcc' functions should raise exceptions. + movs r2, #(FCMP_UN_POSITIVE + FCMP_RAISE_EXCEPTIONS + FCMP_3WAY) + + // int,int __internal_cmpsf2(float, float, int) + // Internal function expects a set of control flags in $r2. + // If ordered, returns a comparison type { 0, 1, 2 } in $r3 + FUNC_ENTRY internal_cmpsf2 + +#else /* L_internal_cmpsf2 */ + WEAK_START_SECTION internal_cmpsf2 CMPSF2_SECTION + CFI_START_FUNCTION + +#endif + + // When operand signs are considered, the comparison result falls + // within one of the following quadrants: + // + // $r0 $r1 $r0-$r1* flags result + // + + > C=0 GT + // + + = Z=1 EQ + // + + < C=1 LT + // + - > C=1 GT + // + - = C=1 GT + // + - < C=1 GT + // - + > C=0 LT + // - + = C=0 LT + // - + < C=0 LT + // - - > C=0 LT + // - - = Z=1 EQ + // - - < C=1 GT + // + // *When interpreted as a subtraction of unsigned integers + // + // From the table, it is clear that in the presence of any negative + // operand, the natural result simply needs to be reversed. + // Save the 'N' flag for later use. 
+ movs r3, r0 + orrs r3, r1 + mov ip, r3 + + // Keep the absolute value of the second argument for NAN testing. + lsls r3, r1, #1 + + // With the absolute value of the second argument safely stored, + // recycle $r1 to calculate the difference of the arguments. + subs r1, r0, r1 + + // Save the 'C' flag for use later. + // Effectively shifts all the flags 1 bit left. + adcs r2, r2 + + // Absolute value of the first argument. + lsls r0, #1 + + // Identify the largest absolute value between the two arguments. + cmp r0, r3 + bhs LLSYM(__fcmp_sorted) + + // Keep the larger absolute value for NAN testing. + // NOTE: When the arguments are respectively a signaling NAN and a + // quiet NAN, the quiet NAN has precedence. This has consequences + // if TRAP_NANS is enabled, but the flags indicate that exceptions + // for quiet NANs should be suppressed. After the signaling NAN is + // discarded, no exception is raised, although it should have been. + // This could be avoided by using a fifth register to save both + // arguments until the signaling bit can be tested, but that seems + // like an excessive amount of ugly code for an ambiguous case. + movs r0, r3 + + LLSYM(__fcmp_sorted): + // If $r3 is NAN, the result is unordered. + movs r3, #255 + lsls r3, #24 + cmp r0, r3 + bhi LLSYM(__fcmp_unordered) + + // Positive and negative zero must be considered equal. + // If the larger absolute value is +/-0, both must have been +/-0. + subs r3, r0, #0 + beq LLSYM(__fcmp_zero) + + // Test for regular equality. + subs r3, r1, #0 + beq LLSYM(__fcmp_zero) + + // Isolate the saved 'C', and invert if either argument was negative. + // Remembering that the original subtraction was $r1 - $r0, + // the result will be 1 if 'C' was set (gt), or 0 for not 'C' (lt). + lsls r3, r2, #31 + add r3, ip + lsrs r3, #31 + + // HACK: Force the 'C' bit clear, + // since bit[30] of $r3 may vary with the operands. 
+ adds r3, #0 + + LLSYM(__fcmp_zero): + // After everything is combined, the temp result will be + // 2 (gt), 1 (eq), or 0 (lt). + adcs r3, r3 + + // Short-circuit return if the 3-way comparison flag is set. + // Otherwise, shifts the condition mask into bits[2:0]. + lsrs r2, #2 + bcs LLSYM(__fcmp_return) + + // If the bit corresponding to the comparison result is set in the + // acceptance mask, a '1' will fall out into the result. + movs r0, #1 + lsrs r2, r3 + ands r0, r2 + RET + + LLSYM(__fcmp_unordered): + // Set up the requested UNORDERED result. + // Remember the shift in the flags (above). + lsrs r2, #6 + + #if defined(TRAP_EXCEPTIONS) && TRAP_EXCEPTIONS + // TODO: ... The + + + #endif + + #if defined(TRAP_NANS) && TRAP_NANS + // Always raise an exception if FCMP_RAISE_EXCEPTIONS was specified. + bcs LLSYM(__fcmp_trap) + + // If FCMP_NO_EXCEPTIONS was specified, no exceptions on quiet NANs. + // The comparison flags are moot, so $r1 can serve as scratch space. + lsrs r1, r0, #24 + bcs LLSYM(__fcmp_return2) + + LLSYM(__fcmp_trap): + // Restore the NAN (sans sign) for an argument to the exception. + // As an IRQ, the handler restores all registers, including $r3. + // NOTE: The service handler may not return. + lsrs r0, #1 + movs r3, #(UNORDERED_COMPARISON) + svc #(SVC_TRAP_NAN) + #endif + + LLSYM(__fcmp_return2): + // HACK: Work around result register mapping. + // This could probably be eliminated by remapping the flags register. + movs r3, r2 + + LLSYM(__fcmp_return): + // Finish setting up the result. + // Constant subtraction allows a negative result while keeping the + // $r2 flag control word within 8 bits, particularly for FCMP_UN*. + // This operation also happens to set the 'Z' and 'C' flags correctly + // per the requirements of __aeabi_cfcmple() et al. 
+ subs r0, r3, #1 + RET + + CFI_END_FUNCTION +FUNC_END internal_cmpsf2 + #ifdef L_arm_cmpsf2 +FUNC_END ltsf2 +FUNC_END lesf2 +FUNC_END cmpsf2 +#endif + +#endif /* L_arm_cmpsf2 || L_internal_cmpsf2 */ + + +#ifdef L_arm_eqsf2 + +// int __eqsf2(float, float) +// +// Returns the three-way comparison result of $r0 with $r1: +// * -1 if ($r0 < $r1) +// * 0 if ($r0 == $r1) +// * +1 if ($r0 > $r1), or either argument is NAN +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. +FUNC_START_SECTION eqsf2 .text.sorted.libgcc.fcmp.eqsf2 +FUNC_ALIAS nesf2 eqsf2 + CFI_START_FUNCTION + + // Assumption: The 'libgcc' functions should raise exceptions. + movs r2, #(FCMP_UN_POSITIVE + FCMP_NO_EXCEPTIONS + FCMP_3WAY) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +FUNC_END nesf2 +FUNC_END eqsf2 + +#endif /* L_arm_eqsf2 */ + + +#ifdef L_arm_gesf2 + +// int __gesf2(float, float) +// +// Returns the three-way comparison result of $r0 with $r1: +// * -1 if ($r0 < $r1), or either argument is NAN +// * 0 if ($r0 == $r1) +// * +1 if ($r0 > $r1) +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. +FUNC_START_SECTION gesf2 .text.sorted.libgcc.fcmp.gesf2 +FUNC_ALIAS gtsf2 gesf2 + CFI_START_FUNCTION + + // Assumption: The 'libgcc' functions should raise exceptions. + movs r2, #(FCMP_UN_NEGATIVE + FCMP_RAISE_EXCEPTIONS + FCMP_3WAY) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +FUNC_END gtsf2 +FUNC_END gesf2 + +#endif /* L_arm_gesf2 */ + + +#ifdef L_arm_fcmpeq + +// int __aeabi_fcmpeq(float, float) +// Returns '1' in $r0 if ($r0 == $r1) (ordered). +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. 
+FUNC_START_SECTION aeabi_fcmpeq .text.sorted.libgcc.fcmp.fcmpeq + CFI_START_FUNCTION + + movs r2, #(FCMP_UN_ZERO + FCMP_NO_EXCEPTIONS + FCMP_EQ) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +FUNC_END aeabi_fcmpeq + +#endif /* L_arm_fcmpeq */ + + +#ifdef L_arm_fcmpne + +// int __aeabi_fcmpne(float, float) [non-standard] +// Returns '1' in $r0 if ($r0 != $r1) (ordered). +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. +FUNC_START_SECTION aeabi_fcmpne .text.sorted.libgcc.fcmp.fcmpne + CFI_START_FUNCTION + + movs r2, #(FCMP_UN_ZERO + FCMP_NO_EXCEPTIONS + FCMP_NE) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +FUNC_END aeabi_fcmpne + +#endif /* L_arm_fcmpne */ + + +#ifdef L_arm_fcmplt + +// int __aeabi_fcmplt(float, float) +// Returns '1' in $r0 if ($r0 < $r1) (ordered). +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. +FUNC_START_SECTION aeabi_fcmplt .text.sorted.libgcc.fcmp.fcmplt + CFI_START_FUNCTION + + movs r2, #(FCMP_UN_ZERO + FCMP_RAISE_EXCEPTIONS + FCMP_LT) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +FUNC_END aeabi_fcmplt + +#endif /* L_arm_fcmplt */ + + +#ifdef L_arm_fcmple + +// int __aeabi_fcmple(float, float) +// Returns '1' in $r0 if ($r0 <= $r1) (ordered). +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. +FUNC_START_SECTION aeabi_fcmple .text.sorted.libgcc.fcmp.fcmple + CFI_START_FUNCTION + + movs r2, #(FCMP_UN_ZERO + FCMP_RAISE_EXCEPTIONS + FCMP_LE) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +FUNC_END aeabi_fcmple + +#endif /* L_arm_fcmple */ + + +#ifdef L_arm_fcmpge + +// int __aeabi_fcmpge(float, float) +// Returns '1' in $r0 if ($r0 >= $r1) (ordered). +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. 
+FUNC_START_SECTION aeabi_fcmpge .text.sorted.libgcc.fcmp.fcmpge + CFI_START_FUNCTION + + movs r2, #(FCMP_UN_ZERO + FCMP_RAISE_EXCEPTIONS + FCMP_GE) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +FUNC_END aeabi_fcmpge + +#endif /* L_arm_fcmpge */ + + +#ifdef L_arm_fcmpgt + +// int __aeabi_fcmpgt(float, float) +// Returns '1' in $r0 if ($r0 > $r1) (ordered). +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. +FUNC_START_SECTION aeabi_fcmpgt .text.sorted.libgcc.fcmp.fcmpgt + CFI_START_FUNCTION + + movs r2, #(FCMP_UN_ZERO + FCMP_RAISE_EXCEPTIONS + FCMP_GT) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +FUNC_END aeabi_fcmpgt + +#endif /* L_arm_fcmpgt */ + + +#ifdef L_arm_unordsf2 + +// int __aeabi_fcmpun(float, float) +// Returns '1' in $r0 if $r0 and $r1 are unordered. +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. +FUNC_START_SECTION aeabi_fcmpun .text.sorted.libgcc.fcmp.fcmpun +FUNC_ALIAS unordsf2 aeabi_fcmpun + CFI_START_FUNCTION + + movs r2, #(FCMP_UN_POSITIVE + FCMP_NO_EXCEPTIONS) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +FUNC_END unordsf2 +FUNC_END aeabi_fcmpun + +#endif /* L_arm_unordsf2 */ + + +#if defined(L_arm_cfcmple) || defined(L_arm_cfrcmple) || \ + (defined(L_arm_cfcmpeq) && defined(TRAP_NANS) && TRAP_NANS) + +#if defined(L_arm_cfcmple) + #define CFCMPLE_NAME aeabi_cfcmple + #define CFCMPLE_SECTION .text.sorted.libgcc.fcmp.cfcmple +#elif defined(L_arm_cfrcmple) + #define CFCMPLE_NAME aeabi_cfrcmple + #define CFCMPLE_SECTION .text.sorted.libgcc.fcmp.cfrcmple +#else + #define CFCMPLE_NAME aeabi_cfcmpeq + #define CFCMPLE_SECTION .text.sorted.libgcc.fcmp.cfcmpeq +#endif + +// void __aeabi_cfcmple(float, float) +// void __aeabi_cfrcmple(float, float) +// void __aeabi_cfcmpeq(float, float) +// __aeabi_cfrcmple() first reverses the order of the input arguments. 
+// __aeabi_cfcmpeq() is an alias of __aeabi_cfcmple() if the library +// does not support signaling NAN exceptions. +// Three-way compare of $r0 ? $r1, with result in the status flags: +// * 'Z' is set only when the operands are ordered and equal. +// * 'C' is clear only when the operands are ordered and $r0 < $r1. +// Preserves all core registers except $ip, $lr, and the CPSR. +// Same parent section as __cmpsf2() to keep tail call branch within range. +FUNC_START_SECTION CFCMPLE_NAME CFCMPLE_SECTION + + // __aeabi_cfcmpeq() is defined separately when TRAP_NANS is enabled. + #if defined(L_arm_cfcmple) && !(defined(TRAP_NANS) && TRAP_NANS) + FUNC_ALIAS aeabi_cfcmpeq aeabi_cfcmple + #endif + + CFI_START_FUNCTION + + #if defined(DOUBLE_ALIGN_STACK) && DOUBLE_ALIGN_STACK + push { r0 - r3, rT, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 24 + .cfi_rel_offset r0, 0 + .cfi_rel_offset r1, 4 + .cfi_rel_offset r2, 8 + .cfi_rel_offset r3, 12 + .cfi_rel_offset rT, 16 + .cfi_rel_offset lr, 20 + #else + push { r0 - r3, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 20 + .cfi_rel_offset r0, 0 + .cfi_rel_offset r1, 4 + .cfi_rel_offset r2, 8 + .cfi_rel_offset r3, 12 + .cfi_rel_offset lr, 16 + #endif + + #ifdef L_arm_cfcmple + // Even though the result in $r0 will be discarded, the 3-way + // subtraction of '-1' that generates this result happens to + // set 'C' and 'Z' perfectly. Unordered results group with '>'. + // This happens to be the same control word as __cmpsf2(), meaning + // that __cmpsf2() is a potential direct branch target. However, + // the choice to set a redundant control word and branch to + // __internal_cmpsf2() makes this compiled object more robust + // against linking with 'foreign' __cmpsf2() implementations. + movs r2, #(FCMP_UN_POSITIVE + FCMP_RAISE_EXCEPTIONS + FCMP_3WAY) + #elif defined(L_arm_cfrcmple) + // Instead of reversing the order of the operands, it's slightly + // faster to invert the result. 
But, for that to fully work, + // the sense of NAN must be pre-inverted. + movs r2, #(FCMP_UN_NEGATIVE + FCMP_NO_EXCEPTIONS + FCMP_3WAY) + #else /* L_arm_cfcmpeq */ + // Same as __aeabi_cfcmple(), except no exceptions on quiet NAN. + movs r2, #(FCMP_UN_POSITIVE + FCMP_NO_EXCEPTIONS + FCMP_3WAY) + #endif + + bl SYM(__internal_cmpsf2) + + #ifdef L_arm_cfrcmple + // Instead of reversing the order of the operands, it's slightly + // faster to invert the result. Since __internal_cmpsf2() sets + // its flags by subtracting '1' from $r3, the reverse flags may be + // obtained simply by subtracting $r3 from 1. + movs r1, #1 + subs r1, r3 + #endif /* L_arm_cfrcmple */ + + // Clean up all working registers. + #if defined(DOUBLE_ALIGN_STACK) && DOUBLE_ALIGN_STACK + pop { r0 - r3, rT, pc } + .cfi_restore_state + #else + pop { r0 - r3, pc } + .cfi_restore_state + #endif + + CFI_END_FUNCTION + + #if defined(L_arm_cfcmple) && !(defined(TRAP_NANS) && TRAP_NANS) + FUNC_END aeabi_cfcmpeq + #endif + +FUNC_END CFCMPLE_NAME + +#endif /* L_arm_cfcmple || L_arm_cfrcmple || L_arm_cfcmpeq */ + + +// C99 libm functions +#ifndef __GNUC__ + +// int isgreaterf(float, float) +// Returns '1' in $r0 if ($r0 > $r1) and both $r0 and $r1 are ordered. +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. +FUNC_START_SECTION isgreaterf .text.sorted.libgcc.fcmp.isgtf +MATH_ALIAS isgreaterf + CFI_START_FUNCTION + + movs r2, #(FCMP_UN_ZERO + FCMP_NO_EXCEPTIONS + FCMP_GT) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +MATH_END isgreaterf +FUNC_END isgreaterf + + +// int isgreaterequalf(float, float) +// Returns '1' in $r0 if ($r0 >= $r1) and both $r0 and $r1 are ordered. +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. 
+FUNC_START_SECTION isgreaterequalf .text.sorted.libgcc.fcmp.isgef +MATH_ALIAS isgreaterequalf + CFI_START_FUNCTION + + movs r2, #(FCMP_UN_ZERO + FCMP_NO_EXCEPTIONS + FCMP_GT + FCMP_EQ) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +MATH_END isgreaterequalf +FUNC_END isgreaterequalf + + +// int islessf(float, float) +// Returns '1' in $r0 if ($r0 < $r1) and both $r0 and $r1 are ordered. +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. +FUNC_START_SECTION islessf .text.sorted.libgcc.fcmp.isltf +MATH_ALIAS islessf + CFI_START_FUNCTION + + movs r2, #(FCMP_UN_ZERO + FCMP_NO_EXCEPTIONS + FCMP_LT) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +MATH_END islessf +FUNC_END islessf + + +// int islessequalf(float, float) +// Returns '1' in $r0 if ($r0 <= $r1) and both $r0 and $r1 are ordered. +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. +FUNC_START_SECTION islessequalf .text.sorted.libgcc.fcmp.islef +MATH_ALIAS islessequalf + CFI_START_FUNCTION + + movs r2, #(FCMP_UN_ZERO + FCMP_NO_EXCEPTIONS + FCMP_LT + FCMP_EQ) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +MATH_END islessequalf +FUNC_END islessequalf + + +// int islessgreaterf(float, float) +// Returns '1' in $r0 if ($r0 != $r1) and both $r0 and $r1 are ordered. +// Uses $r2, $r3, and $ip as scratch space. +// Same parent section as __cmpsf2() to keep tail call branch within range. +FUNC_START_SECTION islessgreaterf .text.sorted.libgcc.fcmp.isnef +MATH_ALIAS islessgreaterf + CFI_START_FUNCTION + + movs r2, #(FCMP_UN_ZERO + FCMP_NO_EXCEPTIONS + FCMP_LT + FCMP_GT) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +MATH_END islessgreaterf +FUNC_END islessgreaterf + + +// int isunorderedf(float, float) +// Returns '1' in $r0 if $r0 and $r1 are unordered. +// Uses $r2, $r3, and $ip as scratch space. 
+// Same parent section as __cmpsf2() to keep tail call branch within range. +FUNC_START_SECTION isunorderedf .text.sorted.libgcc.fcmp.isunf +MATH_ALIAS isunorderedf + CFI_START_FUNCTION + + movs r2, #(FCMP_UN_POSITIVE + FCMP_NO_EXCEPTIONS) + b SYM(__internal_cmpsf2) + + CFI_END_FUNCTION +MATH_END isunorderedf +FUNC_END isunorderedf -FUNC_START aeabi_cfrcmple - - mov ip, r0 - movs r0, r1 - mov r1, ip - b 6f - -FUNC_START aeabi_cfcmpeq -FUNC_ALIAS aeabi_cfcmple aeabi_cfcmpeq - - @ The status-returning routines are required to preserve all - @ registers except ip, lr, and cpsr. -6: push {r0, r1, r2, r3, r4, lr} - bl __lesf2 - @ Set the Z flag correctly, and the C flag unconditionally. - cmp r0, #0 - @ Clear the C flag if the return value was -1, indicating - @ that the first operand was smaller than the second. - bmi 1f - movs r1, #0 - cmn r0, r1 -1: - pop {r0, r1, r2, r3, r4, pc} - - FUNC_END aeabi_cfcmple - FUNC_END aeabi_cfcmpeq - FUNC_END aeabi_cfrcmple - -FUNC_START aeabi_fcmpeq - - push {r4, lr} - bl __eqsf2 - negs r0, r0 - adds r0, r0, #1 - pop {r4, pc} - - FUNC_END aeabi_fcmpeq - -.macro COMPARISON cond, helper, mode=sf2 -FUNC_START aeabi_fcmp\cond - - push {r4, lr} - bl __\helper\mode - cmp r0, #0 - b\cond 1f - movs r0, #0 - pop {r4, pc} -1: - movs r0, #1 - pop {r4, pc} - - FUNC_END aeabi_fcmp\cond -.endm - -COMPARISON lt, le -COMPARISON le, le -COMPARISON gt, ge -COMPARISON ge, ge - -#endif /* L_arm_cmpsf2 */ +#endif /* !__GNUC__ */ diff --git a/libgcc/config/arm/eabi/fplib.h b/libgcc/config/arm/eabi/fplib.h new file mode 100644 index 00000000000..ca22d3db8e3 --- /dev/null +++ b/libgcc/config/arm/eabi/fplib.h @@ -0,0 +1,83 @@ +/* fplib.h: Thumb-1 optimized floating point library configuration + + Copyright (C) 2018-2021 Free Software Foundation, Inc. 
+ Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + + +#ifndef __FPLIB_H +#define __FPLIB_H + +/* Enable exception interrupt handler. + Exception implementation is opportunistic, and not fully tested. */ +#define TRAP_EXCEPTIONS (0) +#define EXCEPTION_CODES (0) + +/* Perform extra checks to avoid modifying the sign bit of NANs */ +#define STRICT_NANS (0) + +/* Trap signaling NANs regardless of context. */ +#define TRAP_NANS (0) + +/* TODO: Define service numbers according to the handler requirements */ +#define SVC_TRAP_NAN (0) +#define SVC_FP_EXCEPTION (0) +#define SVC_DIVISION_BY_ZERO (0) + +/* Push extra registers when required for 64-bit stack alignment */ +#define DOUBLE_ALIGN_STACK (1) + +/* Manipulate *div0() parameters to meet the ARM runtime ABI specification. */ +#define PEDANTIC_DIV0 (1) + +/* Define various exception codes. 
These don't map to anything in particular */ +#define SUBTRACTED_INFINITY (20) +#define INFINITY_TIMES_ZERO (21) +#define DIVISION_0_BY_0 (22) +#define DIVISION_INF_BY_INF (23) +#define UNORDERED_COMPARISON (24) +#define CAST_OVERFLOW (25) +#define CAST_INEXACT (26) +#define CAST_UNDEFINED (27) + +/* Exception control for quiet NANs. + If TRAP_NAN support is enabled, signaling NANs always raise exceptions. */ +#define FCMP_RAISE_EXCEPTIONS 16 +#define FCMP_NO_EXCEPTIONS 0 + +/* The bit indexes in these assignments are significant. See implementation. + They are shared publicly for eventual use by newlib. */ +#define FCMP_3WAY (1) +#define FCMP_LT (2) +#define FCMP_EQ (4) +#define FCMP_GT (8) + +#define FCMP_GE (FCMP_EQ | FCMP_GT) +#define FCMP_LE (FCMP_LT | FCMP_EQ) +#define FCMP_NE (FCMP_LT | FCMP_GT) + +/* These flags affect the result of unordered comparisons. See implementation. */ +#define FCMP_UN_THREE (128) +#define FCMP_UN_POSITIVE (64) +#define FCMP_UN_ZERO (32) +#define FCMP_UN_NEGATIVE (0) + +#endif /* __FPLIB_H */ diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index 2f18eb68cba..236b7a7763f 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -2010,6 +2010,7 @@ LSYM(Lchange_\register): #include "bpabi.S" #else /* NOT_ISA_TARGET_32BIT */ #include "bpabi-v6m.S" +#include "eabi/fplib.h" #include "eabi/fcmp.S" #endif /* NOT_ISA_TARGET_32BIT */ #include "eabi/lcmp.S" diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index eb1acd8d5a2..e69579e16dd 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -30,6 +30,7 @@ LIB1ASMFUNCS += \ ifeq (__ARM_ARCH_ISA_THUMB 1,$(ARM_ISA)$(THUMB1_ISA)) # Group 0B: WEAK overridable function objects built for v6m only. 
LIB1ASMFUNCS += \ + _internal_cmpsf2 \ _muldi3 \ endif @@ -80,6 +81,23 @@ LIB1ASMFUNCS += \ _arm_negsf2 \ _arm_unordsf2 \ +ifeq (__ARM_ARCH_ISA_THUMB 1,$(ARM_ISA)$(THUMB1_ISA)) +# Group 2B: Single precision function objects built for v6m only. +LIB1ASMFUNCS += \ + _arm_cfcmpeq \ + _arm_cfcmple \ + _arm_cfrcmple \ + _arm_fcmpeq \ + _arm_fcmpge \ + _arm_fcmpgt \ + _arm_fcmple \ + _arm_fcmplt \ + _arm_fcmpne \ + _arm_eqsf2 \ + _arm_gesf2 \ + +endif + # Group 3: Double precision floating point function objects. LIB1ASMFUNCS += \ From patchwork Fri Jan 15 11:30:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426920 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=FvazjFev; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=Ud9pCJe3; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJvs3vRgz9sWk for ; Fri, 15 Jan 2021 22:33:13 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by 
sourceware.org (Postfix) with ESMTP id BEBF73982428; Fri, 15 Jan 2021 11:32:00 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id C04003982428 for ; Fri, 15 Jan 2021 11:31:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org C04003982428 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id CD841F7B; Fri, 15 Jan 2021 06:31:56 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:31:57 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=pt0aW9hVg8upN uHNIWk4b+omqeY8/XAmhgYxkjWD23M=; b=FvazjFevtFudipxyJGKW2dKUIhoVi tllahBQYrStY5h9ogbmorWDL33A8hbBuG6DSYid1ZjfYzfFSENfoMWaNfh2IV/BW Tl2hdbEobg91Wr8kTlbcmRfVJP9gDieKwuER7SPzREobCkbKOTqhyIkQghrut5F4 uD7z+LMzvH0SKBlrb1r6Ia96/d/WUN1u4o5Siu9laqu0uSUIEm0b8ECItqpcxSFV Uhvl1W5NBbcKt+tFD2Cgf2zrYQkpMnxZqBjJyyvtZ+W1t+iPJRsGuujjI2JIf44O d3dubKukHWFXbOnWwEVcGsPGGMs2qR7aTc/19lnAKufMOaGzIRZjE6AWg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=pt0aW9hVg8upNuHNIWk4b+omqeY8/XAmhgYxkjWD23M=; b=Ud9pCJe3 7eg1p6S9cIu8cLChfG/dNEYFGQPpIfcuI43UdU/sXUpnxrxTVGPCSd/1TgtN/aFu f3BrzdrXXqv7bey1RbAv3OlfDlPw7aIBltKwJ8UiRwpqrWPhg0UjL/X8qFTjRuCA lC4cybotORbED900912Wpuk5YYXoFvvcAbwJJf2UaiEyl1E749tCUFevQ3NF15qS 
9MZeHWhsVL8JA6TEMkJxDiVlpaLrgXCP3Izscf8AaI4PwOGJfQ88A5p+8D2Up9C4 UWnsE5HLgA2v+m4s56ZGI3IUDmw0R9Fy7xgEr9Djy1wdalkWvqtjJUxORBZVIKxp 4aK5bh2ThybKMg== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddtgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeehtefhtdejheekgeehheefhfevgf fggfduhfeukeekveeiueehgffhjeetkeeiheenucffohhmrghinhepsghprggsihdqvhei mhdrshgspdhfrgguugdrshgspdhgnhhurdhorhhgpdhlihgsudhfuhhntghsrdhssgenuc fkphepjedurdefiedruddttddrvddvtdenucevlhhushhtvghrufhiiigvpedtnecurfgr rhgrmhepmhgrihhlfhhrohhmpehgnhhusegurghnihgvlhgvnhhgvghlrdgtohhm X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id 0D7CC108005B; Fri, 15 Jan 2021 06:31:55 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVtt0023766; Fri, 15 Jan 2021 03:31:55 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 25/33] Refactor Thumb-1 float subtraction into a new file Date: Fri, 15 Jan 2021 03:30:53 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing 
list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" This will make it easier to isolate changes in subsequent patches. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi-v6m.S (__aeabi_frsub): Moved to ... * config/arm/eabi/fadd.S: New file. * config/arm/lib1funcs.S: #include eabi/fadd.S (v6m only). --- libgcc/config/arm/bpabi-v6m.S | 16 --------------- libgcc/config/arm/eabi/fadd.S | 38 +++++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 1 + 3 files changed, 39 insertions(+), 16 deletions(-) create mode 100644 libgcc/config/arm/eabi/fadd.S diff --git a/libgcc/config/arm/bpabi-v6m.S b/libgcc/config/arm/bpabi-v6m.S index 7c874f06218..c76c3b0568b 100644 --- a/libgcc/config/arm/bpabi-v6m.S +++ b/libgcc/config/arm/bpabi-v6m.S @@ -33,22 +33,6 @@ .eabi_attribute 25, 1 #endif /* __ARM_EABI__ */ - -#ifdef L_arm_addsubsf3 - -FUNC_START aeabi_frsub - - push {r4, lr} - movs r4, #1 - lsls r4, #31 - eors r0, r0, r4 - bl __aeabi_fadd - pop {r4, pc} - - FUNC_END aeabi_frsub - -#endif /* L_arm_addsubsf3 */ - #ifdef L_arm_addsubdf3 FUNC_START aeabi_drsub diff --git a/libgcc/config/arm/eabi/fadd.S b/libgcc/config/arm/eabi/fadd.S new file mode 100644 index 00000000000..fffbd91d1bc --- /dev/null +++ b/libgcc/config/arm/eabi/fadd.S @@ -0,0 +1,38 @@ +/* Copyright (C) 2006-2021 Free Software Foundation, Inc. + Contributed by CodeSourcery. + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. 
+ + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + + +#ifdef L_arm_addsubsf3 + +FUNC_START aeabi_frsub + + push {r4, lr} + movs r4, #1 + lsls r4, #31 + eors r0, r0, r4 + bl __aeabi_fadd + pop {r4, pc} + + FUNC_END aeabi_frsub + +#endif /* L_arm_addsubsf3 */ + diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index 236b7a7763f..31132633f32 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -2012,6 +2012,7 @@ LSYM(Lchange_\register): #include "bpabi-v6m.S" #include "eabi/fplib.h" #include "eabi/fcmp.S" +#include "eabi/fadd.S" #endif /* NOT_ISA_TARGET_32BIT */ #include "eabi/lcmp.S" #endif /* !__symbian__ */ From patchwork Fri Jan 15 11:30:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426921 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=Swz0+GFc; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com 
header.a=rsa-sha256 header.s=fm1 header.b=UbRo6X5O; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJvx4rgNz9sWq for ; Fri, 15 Jan 2021 22:33:17 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C24E63985441; Fri, 15 Jan 2021 11:32:04 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id 3A6233982433 for ; Fri, 15 Jan 2021 11:32:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 3A6233982433 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 53D3FF7E; Fri, 15 Jan 2021 06:31:59 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:31:59 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=yd+SAG1YXYWnn Fvse1+bnfUzEfXRITjoDSAPDfVBVvw=; b=Swz0+GFccAgxJuhizc+P5JanaFucj Q1Ha1e1S/A3ATyDxlmfNGWEYRsJOXz2L++l+Uz8gdhjbtCcToMUilE9n7YsaO8DX CpnIZRPCf4T6EMB0Upbpv8OMeEik81IfZ9Yx6JySkuS0uuehbkpUesVzuuP4Pbh3 VIB5fd/N0tahC9HACA03izwUQZdbd10u7Az1uI2Krjz9IZajO+zzhkMKlatGuPr1 XLqVz3jo5xhlFqZM3Vg8UFN90PXGOmmscRxOuw229eiffamyBGALgTNowt+nnQ56 poQ+lxG/di09W4fr6v7MhhalKIcoE+3xuToYdvdtAovqa1BAbyISjWMfw== DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=yd+SAG1YXYWnnFvse1+bnfUzEfXRITjoDSAPDfVBVvw=; b=UbRo6X5O ER7v9OP0obmW3sCQKiq2PbefNgmtxy685uxaCCr6qEQrdcbirl/P54XykLAZMG3X BSQ27AGHzZyRFJNQ1rkArffFdNecVjcS0wX9RffIUWqgkZKmadPDimtqDsy/zl74 F0ASvHU64lCBbl6QQnOREE9MI7VNHyIqHZlo03BzjzCcGk+g2M1sXXhpb/5fJyPD Aq0Q/5XXxSIQvY7smgJ9+UdIUt6giq6JlfOZKx4C0yUXjoqcdbRSJ4IJL/kDrmOH I+uWdur6vysRNAETkOniQMR1xmO+Y3sbrjliKmCc22IqgJB5zVG+Kvto81o8hul1 PFusp3R745I6MQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddthecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeefudduffejjeehieetledtfeetle etieevkefhgeejgeffkeeviefggeekleejvdenucffohhmrghinhepfhgruggurdhssgdp ghhnuhdrohhrghdpfhhnvghgrdhssgdpfhhuthhilhdrshgspdhlihgsudhfuhhntghsrd hssgenucfkphepjedurdefiedruddttddrvddvtdenucevlhhushhtvghrufhiiigvpedt necurfgrrhgrmhepmhgrihhlfhhrohhmpehgnhhusegurghnihgvlhgvnhhgvghlrdgtoh hm X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id 319131080066; Fri, 15 Jan 2021 06:31:58 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVvEM023769; Fri, 15 Jan 2021 03:31:57 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 26/33] Import float addition and subtraction from the CM0 library Date: Fri, 15 Jan 2021 03:30:54 -0800 Message-Id: 
<9860b254bd2bb0cecc49ad89be9cb6f8cc2246be.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" Since this is the first import of single-precision functions, some common parsing and formatting routines are also included. These common routines will be referenced by other functions in subsequent commits. However, even if the size penalty is accounted entirely to __addsf3(), the total compiled size is still less than half the size of soft-float. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/eabi/fadd.S (__addsf3, __subsf3): Added new functions. * config/arm/eabi/fneg.S (__negsf2): Added new file. * config/arm/eabi/futil.S (__fp_normalize2, __fp_lalign2, __fp_assemble, __fp_overflow, __fp_zero, __fp_check_nan): Added new file with shared helper functions. * config/arm/lib1funcs.S: #include eabi/fneg.S and eabi/futil.S (v6m only). * config/arm/t-elf (LIB1ASMFUNCS): Added _arm_addsf3, _arm_frsubsf3, _fp_exceptionf, _fp_checknanf, _fp_assemblef, and _fp_normalizef. 
--- libgcc/config/arm/eabi/fadd.S | 306 +++++++++++++++++++++++- libgcc/config/arm/eabi/fneg.S | 76 ++++++ libgcc/config/arm/eabi/fplib.h | 3 - libgcc/config/arm/eabi/futil.S | 418 +++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 2 + libgcc/config/arm/t-elf | 6 + 6 files changed, 798 insertions(+), 13 deletions(-) create mode 100644 libgcc/config/arm/eabi/fneg.S create mode 100644 libgcc/config/arm/eabi/futil.S diff --git a/libgcc/config/arm/eabi/fadd.S b/libgcc/config/arm/eabi/fadd.S index fffbd91d1bc..77b81d62b3b 100644 --- a/libgcc/config/arm/eabi/fadd.S +++ b/libgcc/config/arm/eabi/fadd.S @@ -1,5 +1,7 @@ -/* Copyright (C) 2006-2021 Free Software Foundation, Inc. - Contributed by CodeSourcery. +/* fadd.S: Thumb-1 optimized 32-bit float addition and subtraction + + Copyright (C) 2018-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) This file is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the @@ -21,18 +23,302 @@ . */ +#ifdef L_arm_frsubsf3 + +// float __aeabi_frsub(float, float) +// Returns the floating point difference of $r1 - $r0 in $r0. +// Subsection ordering within fpcore keeps conditional branches within range. +FUNC_START_SECTION aeabi_frsub .text.sorted.libgcc.fpcore.b.frsub + CFI_START_FUNCTION + + #if defined(STRICT_NANS) && STRICT_NANS + // Check if $r0 is NAN before modifying. + lsls r2, r0, #1 + movs r3, #255 + lsls r3, #24 + + // Let fadd() find the NAN in the normal course of operation, + // moving it to $r0 and checking the quiet/signaling bit. + cmp r2, r3 + bhi SYM(__aeabi_fadd) + #endif + + // Flip sign and run through fadd(). 
+ movs r2, #1 + lsls r2, #31 + adds r0, r2 + b SYM(__aeabi_fadd) + + CFI_END_FUNCTION +FUNC_END aeabi_frsub + +#endif /* L_arm_frsubsf3 */ + + #ifdef L_arm_addsubsf3 -FUNC_START aeabi_frsub +// float __aeabi_fsub(float, float) +// Returns the floating point difference of $r0 - $r1 in $r0. +// Subsection ordering within fpcore keeps conditional branches within range. +FUNC_START_SECTION aeabi_fsub .text.sorted.libgcc.fpcore.c.faddsub +FUNC_ALIAS subsf3 aeabi_fsub + CFI_START_FUNCTION - push {r4, lr} - movs r4, #1 - lsls r4, #31 - eors r0, r0, r4 - bl __aeabi_fadd - pop {r4, pc} + #if defined(STRICT_NANS) && STRICT_NANS + // Check if $r1 is NAN before modifying. + lsls r2, r1, #1 + movs r3, #255 + lsls r3, #24 - FUNC_END aeabi_frsub + // Let fadd() find the NAN in the normal course of operation, + // moving it to $r0 and checking the quiet/signaling bit. + cmp r2, r3 + bhi SYM(__aeabi_fadd) + #endif + + // Flip sign and fall into fadd(). + movs r2, #1 + lsls r2, #31 + adds r1, r2 #endif /* L_arm_addsubsf3 */ + +// The execution of __subsf3() flows directly into __addsf3(), such that +// instructions must appear consecutively in the same memory section. +// However, this construction inhibits the ability to discard __subsf3() +// when only using __addsf3(). +// Therefore, this block configures __addsf3() for compilation twice. +// The first version is a minimal standalone implementation, and the second +// version is the continuation of __subsf3(). The standalone version must +// be declared WEAK, so that the combined version can supersede it and +// provide both symbols when required. +// '_arm_addsf3' should appear before '_arm_addsubsf3' in LIB1ASMFUNCS. +#if defined(L_arm_addsf3) || defined(L_arm_addsubsf3) + +#ifdef L_arm_addsf3 +// float __aeabi_fadd(float, float) +// Returns the floating point sum of $r0 + $r1 in $r0. +// Subsection ordering within fpcore keeps conditional branches within range. 
+WEAK_START_SECTION aeabi_fadd .text.sorted.libgcc.fpcore.c.fadd +WEAK_ALIAS addsf3 aeabi_fadd + CFI_START_FUNCTION + +#else /* L_arm_addsubsf3 */ +FUNC_ENTRY aeabi_fadd +FUNC_ALIAS addsf3 aeabi_fadd + +#endif + + // Standard registers, compatible with exception handling. + push { rT, lr } + .cfi_remember_state + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + // Drop the sign bit to compare absolute value. + lsls r2, r0, #1 + lsls r3, r1, #1 + + // Save the logical difference of original values. + // This actually makes the following swap slightly faster. + eors r1, r0 + + // Compare exponents+mantissa. + // MAYBE: Speedup for equal values? This would have to separately + // check for NAN/INF and then either: + // * Increase the exponent by '1' (for multiply by 2), or + // * Return +0 + cmp r2, r3 + bhs LLSYM(__fadd_ordered) + + // Reorder operands so the larger absolute value is in $r2, + // the corresponding original operand is in $r0, + // and the smaller absolute value is in $r3. + movs r3, r2 + eors r0, r1 + lsls r2, r0, #1 + + LLSYM(__fadd_ordered): + // Extract the exponent of the larger operand. + // If INF/NAN, then it becomes an automatic result. + lsrs r2, #24 + cmp r2, #255 + beq LLSYM(__fadd_special) + + // Save the sign of the result. + lsrs rT, r0, #31 + lsls rT, #31 + mov ip, rT + + // If the original value of $r1 was +/-0, + // $r0 becomes the automatic result. + // Because $r0 is known to be a finite value, return directly. + // It's actually important that +/-0 not go through the normal + // process, to keep "-0 +/- 0" from being turned into +0. + cmp r3, #0 + beq LLSYM(__fadd_zero) + + // Extract the second exponent. + lsrs r3, #24 + + // Calculate the difference of exponents (always positive). 
+ subs r3, r2, r3 + + #if !defined(__OPTIMIZE_SIZE__) || !__OPTIMIZE_SIZE__ + // If the smaller operand is more than 25 bits less significant + // than the larger, the larger operand is an automatic result. + // The smaller operand can't affect the result, even after rounding. + cmp r3, #25 + bhi LLSYM(__fadd_return) + #endif + + // Isolate both mantissas, recovering the smaller. + lsls rT, r0, #9 + lsls r0, r1, #9 + eors r0, rT + + // If the larger operand is normal, restore the implicit '1'. + // If subnormal, the second operand will also be subnormal. + cmp r2, #0 + beq LLSYM(__fadd_normal) + adds rT, #1 + rors rT, rT + + // If the smaller operand is also normal, restore the implicit '1'. + // If subnormal, the smaller operand effectively remains multiplied + // by 2 w.r.t the first. This compensates for subnormal exponents, + // which are technically still -126, not -127. + cmp r2, r3 + beq LLSYM(__fadd_normal) + adds r0, #1 + rors r0, r0 + + LLSYM(__fadd_normal): + // Provide a spare bit for overflow. + // Normal values will be aligned in bits [30:7] + // Subnormal values will be aligned in bits [30:8] + lsrs rT, #1 + lsrs r0, #1 + + // If signs weren't matched, negate the smaller operand (branchless). + asrs r1, #31 + eors r0, r1 + subs r0, r1 + + // Keep a copy of the small mantissa for the remainder. + movs r1, r0 + + // Align the small mantissa for addition. + asrs r1, r3 + + // Isolate the remainder. + // NOTE: Given the various cases above, the remainder will only + // be used as a boolean for rounding ties to even. It is not + // necessary to negate the remainder for subtraction operations. + rsbs r3, #0 + adds r3, #32 + lsls r0, r3 + + // Because operands are ordered, the result will never be negative. + // If the result of subtraction is 0, the overall result must be +0. + // If the overall result in $r1 is 0, then the remainder in $r0 + // must also be 0, so no register copy is necessary on return. 
+ adds r1, rT + beq LLSYM(__fadd_return) + + // The large operand was aligned in bits [29:7]... + // If the larger operand was normal, the implicit '1' went in bit [30]. + // + // After addition, the MSB of the result may be in bit: + // 31, if the result overflowed. + // 30, the usual case. + // 29, if there was a subtraction of operands with exponents + // differing by more than 1. + // < 28, if there was a subtraction of operands with exponents +/-1, + // < 28, if both operands were subnormal. + + // In the last case (both subnormal), the alignment shift will be 8, + // the exponent will be 0, and no rounding is necessary. + cmp r2, #0 + bne SYM(__fp_assemble) + + // Subnormal overflow automatically forms the correct exponent. + lsrs r0, r1, #8 + add r0, ip + + LLSYM(__fadd_return): + pop { rT, pc } + .cfi_restore_state + + LLSYM(__fadd_special): + #if defined(TRAP_NANS) && TRAP_NANS + // If $r1 is (also) NAN, force it in place of $r0. + // As the smaller NAN, it is more likely to be signaling. + movs rT, #255 + lsls rT, #24 + cmp r3, rT + bls LLSYM(__fadd_ordered2) + + eors r0, r1 + #endif + + LLSYM(__fadd_ordered2): + // There are several possible cases to consider here: + // 1. Any NAN/NAN combination + // 2. Any NAN/INF combination + // 3. Any NAN/value combination + // 4. INF/INF with matching signs + // 5. INF/INF with mismatched signs. + // 6. Any INF/value combination. + // In all cases except case 5, it is safe to return $r0. + // In case 5, a new NAN must be constructed. + // First, check the mantissa to see if $r0 is NAN. + lsls r2, r0, #9 + + #if defined(TRAP_NANS) && TRAP_NANS + bne SYM(__fp_check_nan) + #else + bne LLSYM(__fadd_return) + #endif + + LLSYM(__fadd_zero): + // Next, check for an INF/value combination. + lsls r2, r1, #1 + bne LLSYM(__fadd_return) + + // Finally, check for matching sign on INF/INF. + // Also accepts matching signs when +/-0 are added. 
+ bcc LLSYM(__fadd_return) + + #if defined(EXCEPTION_CODES) && EXCEPTION_CODES + movs r3, #(SUBTRACTED_INFINITY) + #endif + + #if defined(TRAP_EXCEPTIONS) && TRAP_EXCEPTIONS + // Restore original operands. + eors r1, r0 + #endif + + // Identify mismatched 0. + lsls r2, r0, #1 + bne SYM(__fp_exception) + + // Force mismatched 0 to +0. + eors r0, r0 + pop { rT, pc } + .cfi_restore_state + + CFI_END_FUNCTION +FUNC_END addsf3 +FUNC_END aeabi_fadd + +#ifdef L_arm_addsubsf3 +FUNC_END subsf3 +FUNC_END aeabi_fsub +#endif + +#endif /* L_arm_addsf3 */ + diff --git a/libgcc/config/arm/eabi/fneg.S b/libgcc/config/arm/eabi/fneg.S new file mode 100644 index 00000000000..b92e247c3d3 --- /dev/null +++ b/libgcc/config/arm/eabi/fneg.S @@ -0,0 +1,76 @@ +/* fneg.S: Thumb-1 optimized 32-bit float negation + + Copyright (C) 2018-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + + +#ifdef L_arm_negsf2 + +// float __aeabi_fneg(float) [obsolete] +// The argument and result are in $r0. +// Uses $r1 and $r2 as scratch registers. 
+// Subsection ordering within fpcore keeps conditional branches within range. +FUNC_START_SECTION aeabi_fneg .text.sorted.libgcc.fpcore.a.fneg +FUNC_ALIAS negsf2 aeabi_fneg + CFI_START_FUNCTION + + #if (defined(STRICT_NANS) && STRICT_NANS) || \ + (defined(TRAP_NANS) && TRAP_NANS) + // Check for NAN (unsigned compare: only NAN exceeds the INF pattern). + lsls r1, r0, #1 + movs r2, #255 + lsls r2, #24 + cmp r1, r2 + + #if defined(TRAP_NANS) && TRAP_NANS + bhi LLSYM(__fneg_nan) + #else + bhi LLSYM(__fneg_return) + #endif + #endif + + // Flip the sign. + movs r1, #1 + lsls r1, #31 + eors r0, r1 + + LLSYM(__fneg_return): + RET + + #if defined(TRAP_NANS) && TRAP_NANS + LLSYM(__fneg_nan): + // Set up registers for exception handling. + push { rT, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + b SYM(__fp_check_nan) + #endif + + CFI_END_FUNCTION +FUNC_END negsf2 +FUNC_END aeabi_fneg + +#endif /* L_arm_negsf2 */ + diff --git a/libgcc/config/arm/eabi/fplib.h b/libgcc/config/arm/eabi/fplib.h index ca22d3db8e3..c1924a37ab3 100644 --- a/libgcc/config/arm/eabi/fplib.h +++ b/libgcc/config/arm/eabi/fplib.h @@ -45,9 +45,6 @@ /* Push extra registers when required for 64-bit stack alignment */ #define DOUBLE_ALIGN_STACK (1) -/* Manipulate *div0() parameters to meet the ARM runtime ABI specification. */ -#define PEDANTIC_DIV0 (1) - /* Define various exception codes. These don't map to anything in particular */ #define SUBTRACTED_INFINITY (20) #define INFINITY_TIMES_ZERO (21) diff --git a/libgcc/config/arm/eabi/futil.S b/libgcc/config/arm/eabi/futil.S new file mode 100644 index 00000000000..923c1c28b41 --- /dev/null +++ b/libgcc/config/arm/eabi/futil.S @@ -0,0 +1,418 @@ +/* futil.S: Thumb-1 optimized 32-bit float helper functions + + Copyright (C) 2018-2021 Free Software Foundation, Inc. 
+ Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + + +// These helper functions are exported in distinct object files to keep +// the linker from importing unused code. +// These helper functions do NOT follow AAPCS register conventions. + + +#ifdef L_fp_normalizef + +// Internal function, decomposes the unsigned float in $r2. +// The exponent will be returned in $r2, the mantissa in $r3. +// If subnormal, the mantissa will be normalized, so that +// the MSB of the mantissa (if any) will be aligned at bit[31]. +// Preserves $r0 and $r1, uses $rT as scratch space. +FUNC_START_SECTION fp_normalize2 .text.sorted.libgcc.fpcore.y.alignf + CFI_START_FUNCTION + + // Extract the mantissa. + lsls r3, r2, #8 + + // Extract the exponent. + lsrs r2, #24 + beq SYM(__fp_lalign2) + + // Restore the mantissa's implicit '1'. + adds r3, #1 + rors r3, r3 + + RET + + CFI_END_FUNCTION +FUNC_END fp_normalize2 + + +// Internal function, aligns $r3 so the MSB is aligned in bit[31]. 
+// Simultaneously, subtracts the shift from the exponent in $r2 +FUNC_ENTRY fp_lalign2 + CFI_START_FUNCTION + + #if !defined(__OPTIMIZE_SIZE__) || !__OPTIMIZE_SIZE__ + // Unroll the loop, similar to __clzsi2(). + lsrs rT, r3, #16 + bne LLSYM(__align8) + subs r2, #16 + lsls r3, #16 + + LLSYM(__align8): + lsrs rT, r3, #24 + bne LLSYM(__align4) + subs r2, #8 + lsls r3, #8 + + LLSYM(__align4): + lsrs rT, r3, #28 + bne LLSYM(__align2) + subs r2, #4 + lsls r3, #4 + #endif + + LLSYM(__align2): + // Refresh the state of the N flag before entering the loop. + tst r3, r3 + + LLSYM(__align_loop): + // Test before subtracting to compensate for the natural exponent. + // The largest subnormal should have an exponent of 0, not -1. + bmi LLSYM(__align_return) + subs r2, #1 + lsls r3, #1 + bne LLSYM(__align_loop) + + // Not just a subnormal... 0! By design, this should never happen. + // All callers of this internal function filter 0 as a special case. + // Was there an uncontrolled jump from somewhere else? Cosmic ray? + eors r2, r2 + + #ifdef DEBUG + bkpt #0 + #endif + + LLSYM(__align_return): + RET + + CFI_END_FUNCTION +FUNC_END fp_lalign2 + +#endif /* L_fp_normalizef */ + + +#ifdef L_fp_assemblef + +// Internal function to combine mantissa, exponent, and sign. No return. +// Expects the unsigned result in $r1. To avoid underflow (slower), +// the MSB should be in bits [31:29]. +// Expects any remainder bits of the unrounded result in $r0. +// Expects the exponent in $r2. The exponent must be relative to bit[30]. +// Expects the sign of the result (and only the sign) in $ip. +// Returns a correctly rounded floating value in $r0. +// Subsection ordering within fpcore keeps conditional branches within range. +FUNC_START_SECTION fp_assemble .text.sorted.libgcc.fpcore.g.assemblef + CFI_START_FUNCTION + + // Work around CFI branching limitations. 
+ .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + // Examine the upper three bits [31:29] for underflow. + lsrs r3, r1, #29 + beq LLSYM(__fp_underflow) + + // Convert bits [31:29] into an offset in the range of { 0, -1, -2 }. + // Right rotation aligns the MSB in bit [31], filling any LSBs with '0'. + lsrs r3, r1, #1 + mvns r3, r3 + ands r3, r1 + lsrs r3, #30 + subs r3, #2 + rors r1, r3 + + // Update the exponent, assuming the final result will be normal. + // The new exponent is 1 less than actual, to compensate for the + // eventual addition of the implicit '1' in the result. + // If the final exponent becomes negative, proceed directly to gradual + // underflow, without bothering to search for the MSB. + adds r2, r3 + + FUNC_ENTRY fp_assemble2 + bmi LLSYM(__fp_subnormal) + + LLSYM(__fp_normal): + // Check for overflow (remember the implicit '1' to be added later). + cmp r2, #254 + bge SYM(__fp_overflow) + + // Save LSBs for the remainder. Position doesn't matter any more, + // these are just tiebreakers for round-to-even. + lsls rT, r1, #25 + + // Align the final result. + lsrs r1, #8 + + LLSYM(__fp_round): + // If carry bit is '0', always round down. + bcc LLSYM(__fp_return) + + // The carry bit is '1'. Round to nearest, ties to even. + // If either the saved remainder bits [6:0], the additional remainder + // bits in $r1, or the final LSB is '1', round up. + lsls r3, r1, #31 + orrs r3, rT + orrs r3, r0 + beq LLSYM(__fp_return) + + // If rounding up overflows, then the mantissa result becomes 2.0, + // which yields the correct return value up to and including INF. + adds r1, #1 + + LLSYM(__fp_return): + // Combine the mantissa and the exponent. + lsls r2, #23 + adds r0, r1, r2 + + // Combine with the saved sign. + // End of library call, return to user. 
+ add r0, ip + + #if defined(FP_EXCEPTIONS) && FP_EXCEPTIONS + // TODO: Underflow/inexact reporting IFF remainder + #endif + + pop { rT, pc } + .cfi_restore_state + + LLSYM(__fp_underflow): + // Set up to align the mantissa. + movs r3, r1 + bne LLSYM(__fp_underflow2) + + // MSB wasn't in the upper 32 bits, check the remainder. + // If the remainder is also zero, the result is +/-0. + movs r3, r0 + beq SYM(__fp_zero) + + eors r0, r0 + subs r2, #32 + + LLSYM(__fp_underflow2): + // Save the pre-alignment exponent to align the remainder later. + movs r1, r2 + + // Align the mantissa with the MSB in bit[31]. + bl SYM(__fp_lalign2) + + // Calculate the actual remainder shift. + subs rT, r1, r2 + + // Align the lower bits of the remainder. + movs r1, r0 + lsls r0, rT + + // Combine the upper bits of the remainder with the aligned value. + rsbs rT, #0 + adds rT, #32 + lsrs r1, rT + adds r1, r3 + + // The MSB is now aligned at bit[31] of $r1. + // If the net exponent is still positive, the result will be normal. + // Because this function is used by fmul(), there is a possibility + // that the value is still wider than 24 bits; always round. + tst r2, r2 + bpl LLSYM(__fp_normal) + + LLSYM(__fp_subnormal): + // The MSB is aligned at bit[31], with a net negative exponent. + // The mantissa will need to be shifted right by the absolute value of + // the exponent, plus the normal shift of 8. + + // If the negative shift is smaller than -25, there is no result, + // no rounding, no anything. Return signed zero. + // (Otherwise, the shift for result and remainder may wrap.) + adds r2, #25 + bmi SYM(__fp_inexact_zero) + + // Save the extra bits for the remainder. + movs rT, r1 + lsls rT, r2 + + // Shift the mantissa to create a subnormal. + // Just like normal, round to nearest, ties to even. + movs r3, #33 + subs r3, r2 + eors r2, r2 + + // This shift must be last, leaving the shifted LSB in the C flag. 
+ lsrs r1, r3 + b LLSYM(__fp_round) + + CFI_END_FUNCTION +FUNC_END fp_assemble + + +// Recreate INF with the appropriate sign. No return. +// Expects the sign of the result in $ip. +FUNC_ENTRY fp_overflow + CFI_START_FUNCTION + + #if defined(FP_EXCEPTIONS) && FP_EXCEPTIONS + // TODO: inexact/overflow exception + #endif + + FUNC_ENTRY fp_infinity + + // Work around CFI branching limitations. + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + movs r0, #255 + lsls r0, #23 + add r0, ip + pop { rT, pc } + .cfi_restore_state + + CFI_END_FUNCTION +FUNC_END fp_overflow + + +// Recreate 0 with the appropriate sign. No return. +// Expects the sign of the result in $ip. +FUNC_ENTRY fp_inexact_zero + CFI_START_FUNCTION + + #if defined(FP_EXCEPTIONS) && FP_EXCEPTIONS + // TODO: inexact/underflow exception + #endif + +FUNC_ENTRY fp_zero + + // Work around CFI branching limitations. + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + // Return 0 with the correct sign. + mov r0, ip + pop { rT, pc } + .cfi_restore_state + + CFI_END_FUNCTION +FUNC_END fp_zero +FUNC_END fp_inexact_zero + +#endif /* L_fp_assemblef */ + + +#ifdef L_fp_checknanf + +// Internal function to detect signaling NANs. No return. +// Uses $r2 as scratch space. +// Subsection ordering within fpcore keeps conditional branches within range. +FUNC_START_SECTION fp_check_nan2 .text.sorted.libgcc.fpcore.j.checkf + CFI_START_FUNCTION + + // Work around CFI branching limitations. + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + + FUNC_ENTRY fp_check_nan + + // Check for quiet NAN. + lsrs r2, r0, #23 + bcs LLSYM(__quiet_nan) + + // Raise exception. Preserves both $r0 and $r1. + svc #(SVC_TRAP_NAN) + + // Quiet the resulting NAN. + movs r2, #1 + lsls r2, #22 + orrs r0, r2 + + LLSYM(__quiet_nan): + // End of library call, return to user. 
+ pop { rT, pc } + .cfi_restore_state + + CFI_END_FUNCTION +FUNC_END fp_check_nan +FUNC_END fp_check_nan2 + +#endif /* L_fp_checknanf */ + + +#ifdef L_fp_exceptionf + +// Internal function to report floating point exceptions. No return. +// Expects the original argument(s) in $r0 (possibly also $r1). +// Expects a code that describes the exception in $r3. +// Subsection ordering within fpcore keeps conditional branches within range. +FUNC_START_SECTION fp_exception .text.sorted.libgcc.fpcore.k.exceptf + CFI_START_FUNCTION + + // Work around CFI branching limitations. + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + // Create a quiet NAN. + movs r2, #255 + lsls r2, #1 + adds r2, #1 + lsls r2, #22 + + #if defined(EXCEPTION_CODES) && EXCEPTION_CODES + // Annotate the exception type in the NAN field. + // Make sure that the exception is in the valid region + lsls rT, r3, #13 + orrs r2, rT + #endif + + // Exception handler that expects the result already in $r2, + // typically when the result is not going to be NAN. + FUNC_ENTRY fp_exception2 + + #if defined(TRAP_EXCEPTIONS) && TRAP_EXCEPTIONS + svc #(SVC_FP_EXCEPTION) + #endif + + // TODO: Save exception flags in a static variable. + + // Set up the result, now that the argument isn't required any more. + movs r0, r2 + + // HACK: for sincosf(), with 2 parameters to return. + movs r1, r2 + + // End of library call, return to user. 
+ pop { rT, pc } + .cfi_restore_state + + CFI_END_FUNCTION +FUNC_END fp_exception2 +FUNC_END fp_exception + +#endif /* L_fp_exceptionf */ + diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index 31132633f32..6c3f29b71e2 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -2012,7 +2012,9 @@ LSYM(Lchange_\register): #include "bpabi-v6m.S" #include "eabi/fplib.h" #include "eabi/fcmp.S" +#include "eabi/fneg.S" #include "eabi/fadd.S" +#include "eabi/futil.S" #endif /* NOT_ISA_TARGET_32BIT */ #include "eabi/lcmp.S" #endif /* !__symbian__ */ diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index e69579e16dd..c57d9ef50ac 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -32,6 +32,7 @@ ifeq (__ARM_ARCH_ISA_THUMB 1,$(ARM_ISA)$(THUMB1_ISA)) LIB1ASMFUNCS += \ _internal_cmpsf2 \ _muldi3 \ + _arm_addsf3 \ endif @@ -95,6 +96,11 @@ LIB1ASMFUNCS += \ _arm_fcmpne \ _arm_eqsf2 \ _arm_gesf2 \ + _arm_frsubsf3 \ + _fp_exceptionf \ + _fp_checknanf \ + _fp_assemblef \ + _fp_normalizef \ endif From patchwork Fri Jan 15 11:30:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426922 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=RbPdkSR9; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com
header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=U5WiL5Xa; dkim-atps=neutral Received: from sourceware.org (unknown [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJw22vCJz9sWq for ; Fri, 15 Jan 2021 22:33:22 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id CDE5A398544A; Fri, 15 Jan 2021 11:32:06 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id 19F12398242D for ; Fri, 15 Jan 2021 11:32:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 19F12398242D Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 2ED41EBC; Fri, 15 Jan 2021 06:32:01 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:32:01 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=GWLu28eBPALCf 2r714Mtzvs71ZxGYTPDl8WxxwCHknM=; b=RbPdkSR9zG1VWW1NB/+lto8VvN6RH PG2KILKJp0EGq2GSQMKruBAjyLA9sXGhN8+uTki6k6Lgsea+EiyDlbm5oM9LaOVy mZnCHwUYEJcJtqur2QjYxLCGLLn+ap7zFfoxGIqHyNCSf2bm1txajFb0A58bcPPV OwiwgqVTjuUR9fiGjDLZumHAQEqTEkito21923DdFCVEpMgpWCfgh3Vuwbbx237K Egx1pxP6LPW/OibjKG8fWgtovGwaMJ1Id+WsnVE6L1jxH99Dh3HoTiVj21lO7VUI Z1aBeji7b58QyCIMg8MWSzU9WX9zLo9hqgo0yOtgsR4/mV5tAf5QevioA== DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=GWLu28eBPALCf2r714Mtzvs71ZxGYTPDl8WxxwCHknM=; b=U5WiL5Xa aadBavvp0wJJN5UDZVsqGMLfF7Dg9/sgEShRGMyBcDcvhRvlZEFWpgcVJKzSvPCw OMuxfQuksBGTDcFuE+bJinQ7GM61zCB0tV1NNatzejXVuXe/PqCZ4bhWTeOJbw73 HUxHSKM3YBpInVUNT1iTjJxSRke7urizDpxEsMyY4Q9qAf/yvyIAEh6SNhkMWztt vqP+UPS2J7/aJm8wKbVZsl0oW9yZAi3lK6N/v09HTPnankJ71BLTv7+mRotSzyj8 yrALFHB9u/UDDoeLfVuGT+2ifXtBtES46ZXYLobf2tmFH0B+nvBPG8rCFuvBy+oa 214MPRRlwYDzGA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddthecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpedtiefhledvheevgeevjedvueffue ffkedtheduhfehtdegheeluedthefgteegteenucffohhmrghinhepfhhmuhhlrdhssgdp ghhnuhdrohhrghdplhhisgdufhhunhgtshdrshgsnecukfhppeejuddrfeeirddutddtrd dvvddtnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhep ghhnuhesuggrnhhivghlvghnghgvlhdrtghomh X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id 617FF1080064; Fri, 15 Jan 2021 06:32:00 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBVxZq023772; Fri, 15 Jan 2021 03:31:59 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 27/33] Import float multiplication from the CM0 library Date: Fri, 15 Jan 2021 03:30:55 -0800 Message-Id: <9233540e45cd29e74e65fa2878c06edcacf30c38.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: 
References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/eabi/fmul.S (__mulsf3): New file. * config/arm/lib1funcs.S: #include eabi/fmul.S (v6m only). * config/arm/t-elf (LIB1ASMFUNCS): Moved _mulsf3 to global scope (this object was previously blocked on v6m builds). --- libgcc/config/arm/eabi/fmul.S | 215 ++++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 1 + libgcc/config/arm/t-elf | 3 +- 3 files changed, 218 insertions(+), 1 deletion(-) create mode 100644 libgcc/config/arm/eabi/fmul.S diff --git a/libgcc/config/arm/eabi/fmul.S b/libgcc/config/arm/eabi/fmul.S new file mode 100644 index 00000000000..767de988f0b --- /dev/null +++ b/libgcc/config/arm/eabi/fmul.S @@ -0,0 +1,215 @@ +/* fmul.S: Thumb-1 optimized 32-bit float multiplication + + Copyright (C) 2018-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + + +#ifdef L_arm_mulsf3 + +// float __aeabi_fmul(float, float) +// Returns $r0 after multiplication by $r1. +// Subsection ordering within fpcore keeps conditional branches within range. +FUNC_START_SECTION aeabi_fmul .text.sorted.libgcc.fpcore.m.fmul +FUNC_ALIAS mulsf3 aeabi_fmul + CFI_START_FUNCTION + + // Standard registers, compatible with exception handling. + push { rT, lr } + .cfi_remember_state + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + // Save the sign of the result. + movs rT, r1 + eors rT, r0 + lsrs rT, #31 + lsls rT, #31 + mov ip, rT + + // Set up INF for comparison. + movs rT, #255 + lsls rT, #24 + + // Check for multiplication by zero. + lsls r2, r0, #1 + beq LLSYM(__fmul_zero1) + + lsls r3, r1, #1 + beq LLSYM(__fmul_zero2) + + // Check for INF/NAN. + cmp r3, rT + bhs LLSYM(__fmul_special2) + + cmp r2, rT + bhs LLSYM(__fmul_special1) + + // Because neither operand is INF/NAN, the result will be finite. + // It is now safe to modify the original operand registers. + lsls r0, #9 + + // Isolate the first exponent. When normal, add back the implicit '1'. + // The result is always aligned with the MSB in bit [31]. + // Subnormal mantissas remain effectively multiplied by 2x relative to + // normals, but this works because the weight of a subnormal is -126. + lsrs r2, #24 + beq LLSYM(__fmul_normalize2) + adds r0, #1 + rors r0, r0 + + LLSYM(__fmul_normalize2): + // IMPORTANT: exp10i() jumps in here! 
+ // Repeat for the mantissa of the second operand. + // Short-circuit when the mantissa is 1.0, as the + // first mantissa is already prepared in $r0 + lsls r1, #9 + + // When normal, add back the implicit '1'. + lsrs r3, #24 + beq LLSYM(__fmul_go) + adds r1, #1 + rors r1, r1 + + LLSYM(__fmul_go): + // Calculate the final exponent, relative to bit [30]. + adds rT, r2, r3 + subs rT, #127 + + #if !defined(__OPTIMIZE_SIZE__) || !__OPTIMIZE_SIZE__ + // Short-circuit on multiplication by powers of 2. + lsls r3, r0, #1 + beq LLSYM(__fmul_simple1) + + lsls r3, r1, #1 + beq LLSYM(__fmul_simple2) + #endif + + // Save $ip across the call. + // (Alternatively, could push/pop a separate register, + // but the four instructions here are equally fast + // without imposing on the stack.) + add rT, ip + + // 32x32 unsigned multiplication, 64 bit result. + bl SYM(__umulsidi3) __PLT__ + + // Separate the saved exponent and sign. + sxth r2, rT + subs rT, r2 + mov ip, rT + + b SYM(__fp_assemble) + + #if !defined(__OPTIMIZE_SIZE__) || !__OPTIMIZE_SIZE__ + LLSYM(__fmul_simple2): + // Move the high bits of the result to $r1. + movs r1, r0 + + LLSYM(__fmul_simple1): + // Clear the remainder. + eors r0, r0 + + // Adjust mantissa to match the exponent, relative to bit[30]. + subs r2, rT, #1 + b SYM(__fp_assemble) + #endif + + LLSYM(__fmul_zero1): + // $r0 was equal to 0, set up to check $r1 for INF/NAN. + lsls r2, r1, #1 + + LLSYM(__fmul_zero2): + #if defined(EXCEPTION_CODES) && EXCEPTION_CODES + movs r3, #(INFINITY_TIMES_ZERO) + #endif + + // Check the non-zero operand for INF/NAN. + // If NAN, it should be returned. + // If INF, the result should be NAN. + // Otherwise, the result will be +/-0. + cmp r2, rT + beq SYM(__fp_exception) + + // If the second operand is finite, the result is 0. + blo SYM(__fp_zero) + + #if defined(STRICT_NANS) && STRICT_NANS + // Restore values that got mixed in zero testing, then go back + // to sort out which one is the NAN.
+ lsls r3, r1, #1 + lsls r2, r0, #1 + #elif defined(TRAP_NANS) && TRAP_NANS + // Return NAN with the sign bit cleared. + lsrs r0, r2, #1 + b SYM(__fp_check_nan) + #else + lsrs r0, r2, #1 + // Return NAN with the sign bit cleared. + pop { rT, pc } + .cfi_restore_state + #endif + + LLSYM(__fmul_special2): + // $r1 is INF/NAN. In case of INF, check $r0 for NAN. + cmp r2, rT + + #if defined(TRAP_NANS) && TRAP_NANS + // Force swap if $r0 is not NAN. + bls LLSYM(__fmul_swap) + + // $r0 is NAN, keep if $r1 is INF + cmp r3, rT + beq LLSYM(__fmul_special1) + + // Both are NAN, keep the smaller value (more likely to signal). + cmp r2, r3 + #endif + + // Prefer the NAN already in $r0. + // (If TRAP_NANS, this is the smaller NAN). + bhi LLSYM(__fmul_special1) + + LLSYM(__fmul_swap): + movs r0, r1 + + LLSYM(__fmul_special1): + // $r0 is either INF or NAN. $r1 has already been examined. + // Flags are already set correctly. + lsls r2, r0, #1 + cmp r2, rT + beq SYM(__fp_infinity) + + #if defined(TRAP_NANS) && TRAP_NANS + b SYM(__fp_check_nan) + #else + pop { rT, pc } + .cfi_restore_state + #endif + + CFI_END_FUNCTION +FUNC_END mulsf3 +FUNC_END aeabi_fmul + +#endif /* L_arm_mulsf3 */ + diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index 6c3f29b71e2..ffc343c37d3 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -2015,6 +2015,7 @@ LSYM(Lchange_\register): #include "eabi/fneg.S" #include "eabi/fadd.S" #include "eabi/futil.S" +#include "eabi/fmul.S" #endif /* NOT_ISA_TARGET_32BIT */ #include "eabi/lcmp.S" #endif /* !__symbian__ */ diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index c57d9ef50ac..682f273a1d2 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -10,7 +10,7 @@ THUMB1_ISA:=$(findstring __ARM_ARCH_ISA_THUMB 1,$(shell $(gcc_compile_bare) -dM # inclusion create when only multiplication is used, thus avoiding pulling in # useless division code. 
ifneq (__ARM_ARCH_ISA_THUMB 1,$(ARM_ISA)$(THUMB1_ISA)) -LIB1ASMFUNCS += _arm_muldf3 _arm_mulsf3 +LIB1ASMFUNCS += _arm_muldf3 endif endif # !__symbian__ @@ -26,6 +26,7 @@ LIB1ASMFUNCS += \ _ctzsi2 \ _paritysi2 \ _popcountsi2 \ + _arm_mulsf3 \ ifeq (__ARM_ARCH_ISA_THUMB 1,$(ARM_ISA)$(THUMB1_ISA)) # Group 0B: WEAK overridable function objects built for v6m only. From patchwork Fri Jan 15 11:30:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426923 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=p3h8dK3Q; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=bs2ZtpcR; dkim-atps=neutral Received: from sourceware.org (unknown [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJw6573qz9sXG for ; Fri, 15 Jan 2021 22:33:26 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3E27E398544F; Fri, 15 Jan 2021 11:32:07 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com 
(wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id 4B8B63982433 for ; Fri, 15 Jan 2021 11:32:04 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 4B8B63982433 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 5944BB7A; Fri, 15 Jan 2021 06:32:03 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:32:03 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=FDEZRp6jZrsjN jfZfshqEAn1x2lmVxRTxUVtFhnYLCo=; b=p3h8dK3QLmydq8jX/PbDgaoKM/UbL u9OpWnoPmuZfCh2NZWFLMnCV7LAOd+xrS35Z57jRiGtfV4qS10QwQ6cuQ/32Tbn9 i+sT6M+xSnez2XRzIlDt3hWgyLieUPmg8xo+klulAW+f43fTqEmZeP6hF9IzCb9g d+JzQkxyNyXWR43vV5YMKCzwbeWQZNJ6kxNVMKnja5WtQ+zu71CGqmvqUw8jaHTH 9BgKUZGKSyUk6g2eRojW4rQitysVOKTQuaxMbt9zoHT+YGGE/IzuV2uNNgTKV8OO 17BsSk4m1x5FAAEdgHxeA7uKN4+fB7KmoaPekfvQ8K7MUiuHEzuYbP0DQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=FDEZRp6jZrsjNjfZfshqEAn1x2lmVxRTxUVtFhnYLCo=; b=bs2ZtpcR itA+niF9HGTGyefU+uHawee1/iFxohkvJh74dkcOhGLm0YSmNgvWGWjjMBO2W1/i 6VqsIJ/Pw2Eyh1F1idnM3+Sl7X8ShAm7W/a2phRY+VJTmfr1GfOdHmcdmCqNVYtp 8bDNzK9Ft6NaZB+8XR4xkPibnHkjbqbUW8cnyHhHiuevZso87yrJteJAeWYVLu2S q3csyLwPP9mCy8qzOVZQSh34xx9QjpnCTEV1kZqvtoB3CbYJTZ22SJa4/GvWwCms BPG2Qbtl8/TCYnfWSt8wzD4L50BVezOG5pjOAxFUSa74kZCAHdx6J/i12RcstXa0 soLziI83C18iLA== X-ME-Sender: X-ME-Proxy-Cause: 
gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddthecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpedvleegtdejffelkeeiffduteejve eviefggfefjeegfeelgfeuteehiedvteehveenucffohhmrghinhepfhguihhvrdhssgdp ghhnuhdrohhrghdplhhisgdufhhunhgtshdrshgsnecukfhppeejuddrfeeirddutddtrd dvvddtnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhep ghhnuhesuggrnhhivghlvghnghgvlhdrtghomh X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id 76402108005B; Fri, 15 Jan 2021 06:32:02 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBW1L0023775; Fri, 15 Jan 2021 03:32:01 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 28/33] Import float division from the CM0 library Date: Fri, 15 Jan 2021 03:30:56 -0800 Message-Id: <17ff104042267ebce983e6bcaaaccf5aeb7cb544.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: 
gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" gcc/libgcc/ChangeLog: 2021-01-08 Daniel Engel * config/arm/eabi/fdiv.S (__divsf3, __fp_divloopf): New file. * config/arm/lib1funcs.S: #include eabi/fdiv.S (v6m only). * config/arm/t-elf (LIB1ASMFUNCS): Added _divsf3 and _fp_divloopf. --- libgcc/config/arm/eabi/fdiv.S | 261 ++++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 1 + libgcc/config/arm/t-elf | 2 + 3 files changed, 264 insertions(+) create mode 100644 libgcc/config/arm/eabi/fdiv.S diff --git a/libgcc/config/arm/eabi/fdiv.S b/libgcc/config/arm/eabi/fdiv.S new file mode 100644 index 00000000000..118f4e94676 --- /dev/null +++ b/libgcc/config/arm/eabi/fdiv.S @@ -0,0 +1,261 @@ +/* fdiv.S: Cortex M0 optimized 32-bit float division + + Copyright (C) 2018-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + + +#ifdef L_arm_divsf3 + +// float __aeabi_fdiv(float, float) +// Returns $r0 after division by $r1. +// Subsection ordering within fpcore keeps conditional branches within range. 
+FUNC_START_SECTION aeabi_fdiv .text.sorted.libgcc.fpcore.n.fdiv +FUNC_ALIAS divsf3 aeabi_fdiv + CFI_START_FUNCTION + + // Standard registers, compatible with exception handling. + push { rT, lr } + .cfi_remember_state + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + // Save the sign of the result. + movs r3, r1 + eors r3, r0 + lsrs rT, r3, #31 + lsls rT, #31 + mov ip, rT + + // Set up INF for comparison. + movs rT, #255 + lsls rT, #24 + + // Check for divide by 0. Automatically catches 0/0. + lsls r2, r1, #1 + beq LLSYM(__fdiv_by_zero) + + // Check for INF/INF, or a number divided by itself. + lsls r3, #1 + beq LLSYM(__fdiv_equal) + + // Check the numerator for INF/NAN. + eors r3, r2 + cmp r3, rT + bhs LLSYM(__fdiv_special1) + + // Check the denominator for INF/NAN. + cmp r2, rT + bhs LLSYM(__fdiv_special2) + + // Check the numerator for zero. + cmp r3, #0 + beq SYM(__fp_zero) + + // No action if the numerator is subnormal. + // The mantissa will normalize naturally in the division loop. + lsls r0, #9 + lsrs r1, r3, #24 + beq LLSYM(__fdiv_denominator) + + // Restore the numerator's implicit '1'. + adds r0, #1 + rors r0, r0 + + LLSYM(__fdiv_denominator): + // The denominator must be normalized and left aligned. + bl SYM(__fp_normalize2) + + // 25 bits of precision will be sufficient. + movs rT, #64 + + // Run division. + bl SYM(__fp_divloopf) + b SYM(__fp_assemble) + + LLSYM(__fdiv_equal): + #if defined(EXCEPTION_CODES) && EXCEPTION_CODES + movs r3, #(DIVISION_INF_BY_INF) + #endif + + // The absolute values of both operands are equal, but not 0. + // If both operands are INF, create a new NAN. + cmp r2, rT + beq SYM(__fp_exception) + + #if defined(TRAP_NANS) && TRAP_NANS + // If both operands are NAN, return the NAN in $r0. + bhi SYM(__fp_check_nan) + #else + bhi LLSYM(__fdiv_return) + #endif + + // Return 1.0f, with appropriate sign.
+ movs r0, #127 + lsls r0, #23 + add r0, ip + + LLSYM(__fdiv_return): + pop { rT, pc } + .cfi_restore_state + + LLSYM(__fdiv_special2): + // The denominator is either INF or NAN, numerator is neither. + // Also, the denominator is not equal to 0. + // If the denominator is INF, the result goes to 0. + beq SYM(__fp_zero) + + // The only other option is NAN, fall through to branch. + mov r0, r1 + + LLSYM(__fdiv_special1): + #if defined(TRAP_NANS) && TRAP_NANS + // The numerator is INF or NAN. If NAN, return it directly. + bne SYM(__fp_check_nan) + #else + bne LLSYM(__fdiv_return) + #endif + + // If INF, the result will be INF if the denominator is finite. + // The denominator won't be either INF or 0, + // so fall through the exception trap to check for NAN. + movs r0, r1 + + LLSYM(__fdiv_by_zero): + #if defined(EXCEPTION_CODES) && EXCEPTION_CODES + movs r3, #(DIVISION_0_BY_0) + #endif + + // The denominator is 0. + // If the numerator is also 0, the result will be a new NAN. + // Otherwise the result will be INF, with the correct sign. + lsls r2, r0, #1 + beq SYM(__fp_exception) + + // The result should be NAN if the numerator is NAN. Otherwise, + // the result is INF regardless of the numerator value. + cmp r2, rT + + #if defined(TRAP_NANS) && TRAP_NANS + bhi SYM(__fp_check_nan) + #else + bhi LLSYM(__fdiv_return) + #endif + + // Recreate INF with the correct sign. + b SYM(__fp_infinity) + + CFI_END_FUNCTION +FUNC_END divsf3 +FUNC_END aeabi_fdiv + +#endif /* L_arm_divsf3 */ + + +#ifdef L_fp_divloopf + +// Division helper, possibly to be shared with atan2. +// Expects the numerator mantissa in $r0, exponent in $r1, +// plus the denominator mantissa in $r3, exponent in $r2, and +// a bit pattern in $rT that controls the result precision. +// Returns quotient in $r1, exponent in $r2, pseudo remainder in $r0. +// Subsection ordering within fpcore keeps conditional branches within range. 
+FUNC_START_SECTION fp_divloopf .text.sorted.libgcc.fpcore.o.fdiv2 + CFI_START_FUNCTION + + // Initialize the exponent, relative to bit[30]. + subs r2, r1, r2 + + SYM(__fp_divloopf2): + // The exponent should be (expN - 127) - (expD - 127) + 127. + // An additional offset of 25 is required to account for the + // minimum number of bits in the result (before rounding). + // However, drop '1' because the offset is relative to bit[30], + // while the result is calculated relative to bit[31]. + adds r2, #(127 + 25 - 1) + + #if !defined(__OPTIMIZE_SIZE__) || !__OPTIMIZE_SIZE__ + // Dividing by a power of 2? + lsls r1, r3, #1 + beq LLSYM(__fdiv_simple) + #endif + + // Initialize the result. + eors r1, r1 + + // Clear the MSB, so that when the numerator is smaller than + // the denominator, there is one bit free for a left shift. + // After a single shift, the numerator is guaranteed to be larger. + // The denominator ends up in r3, and the numerator ends up in r0, + // so that the numerator serves as a pseudo-remainder in rounding. + // Shift the numerator one additional bit to compensate for the + // pre-incrementing loop. + lsrs r0, #2 + lsrs r3, #1 + + LLSYM(__fdiv_loop): + // Once the MSB of the output reaches the MSB of the register, + // the result has been calculated to the required precision. + lsls r1, #1 + bmi LLSYM(__fdiv_break) + + // Shift the numerator/remainder left to set up the next bit. + subs r2, #1 + lsls r0, #1 + + // Test if the numerator/remainder is smaller than the denominator, + // do nothing if it is. + cmp r0, r3 + blo LLSYM(__fdiv_loop) + + // If the numerator/remainder is greater or equal, set the next bit, + // and subtract the denominator. + adds r1, rT + subs r0, r3 + + // Short-circuit if the remainder goes to 0. + // Even with the overhead of "subnormal" alignment, + // this is usually much faster than continuing. + bne LLSYM(__fdiv_loop) + + // Compensate the alignment of the result. 
+ // The remainder does not need compensation, it's already 0. + lsls r1, #1 + + LLSYM(__fdiv_break): + RET + + #if !defined(__OPTIMIZE_SIZE__) || !__OPTIMIZE_SIZE__ + LLSYM(__fdiv_simple): + // The numerator becomes the result, with a remainder of 0. + movs r1, r0 + eors r0, r0 + subs r2, #25 + RET + #endif + + CFI_END_FUNCTION +FUNC_END fp_divloopf + +#endif /* L_fp_divloopf */ + diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index ffc343c37d3..98fb544517e 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -2016,6 +2016,7 @@ LSYM(Lchange_\register): #include "eabi/fadd.S" #include "eabi/futil.S" #include "eabi/fmul.S" +#include "eabi/fdiv.S" #endif /* NOT_ISA_TARGET_32BIT */ #include "eabi/lcmp.S" #endif /* !__symbian__ */ diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index 682f273a1d2..1812a1e1a99 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -98,10 +98,12 @@ LIB1ASMFUNCS += \ _arm_eqsf2 \ _arm_gesf2 \ _arm_frsubsf3 \ + _arm_divsf3 \ _fp_exceptionf \ _fp_checknanf \ _fp_assemblef \ _fp_normalizef \ + _fp_divloopf \ endif 
From patchwork Fri Jan 15 11:30:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426924 From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 29/33] Import integer-to-float conversion from the CM0 library Date: Fri, 15 Jan 2021 03:30:57 -0800 Message-Id: <9c2003ea1fc59dae6ca588a3dc80cbcf17f8b2be.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 Cc: Richard.Earnshaw@foss.arm.com gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi-lib.h (__floatdisf, __floatundisf): Remove obsolete RENAME_LIBRARY directives. * config/arm/eabi/ffloat.S (__aeabi_i2f, __aeabi_l2f, __aeabi_ui2f, __aeabi_ul2f): New file. * config/arm/lib1funcs.S: #include eabi/ffloat.S (v6m only). * config/arm/t-elf (LIB1ASMFUNCS): Added _arm_floatunsisf, _arm_floatsisf, and _internal_floatundisf. 
Moved _arm_floatundisf to the weak function group. --- libgcc/config/arm/bpabi-lib.h | 6 - libgcc/config/arm/eabi/ffloat.S | 247 ++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 1 + libgcc/config/arm/t-elf | 5 +- 4 files changed, 252 insertions(+), 7 deletions(-) create mode 100644 libgcc/config/arm/eabi/ffloat.S diff --git a/libgcc/config/arm/bpabi-lib.h b/libgcc/config/arm/bpabi-lib.h index 3cb90b4b345..1e651ead4ac 100644 --- a/libgcc/config/arm/bpabi-lib.h +++ b/libgcc/config/arm/bpabi-lib.h @@ -56,9 +56,6 @@ #ifdef L_floatdidf #define DECLARE_LIBRARY_RENAMES RENAME_LIBRARY (floatdidf, l2d) #endif -#ifdef L_floatdisf -#define DECLARE_LIBRARY_RENAMES RENAME_LIBRARY (floatdisf, l2f) -#endif /* These renames are needed on ARMv6M. Other targets get them from assembly routines. */ @@ -71,9 +68,6 @@ #ifdef L_floatundidf #define DECLARE_LIBRARY_RENAMES RENAME_LIBRARY (floatundidf, ul2d) #endif -#ifdef L_floatundisf -#define DECLARE_LIBRARY_RENAMES RENAME_LIBRARY (floatundisf, ul2f) -#endif /* For ARM bpabi, we only want to use a "__gnu_" prefix for the fixed-point helper functions - not everything in libgcc - in the interests of diff --git a/libgcc/config/arm/eabi/ffloat.S b/libgcc/config/arm/eabi/ffloat.S new file mode 100644 index 00000000000..9690ab85081 --- /dev/null +++ b/libgcc/config/arm/eabi/ffloat.S @@ -0,0 +1,247 @@ +/* ffloat.S: Thumb-1 optimized integer-to-float conversion + + Copyright (C) 2018-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + + +#ifdef L_arm_floatsisf + +// float __aeabi_i2f(int) +// Converts a signed integer in $r0 to float. + +// On little-endian cores (including all Cortex-M), __floatsisf() can be +// implemented as below in 5 instructions. However, it can also be +// implemented by prefixing a single instruction to __floatdisf(). +// A memory savings of 4 instructions at a cost of only 2 execution cycles +// seems reasonable enough. Plus, the trade-off only happens in programs +// that require both __floatsisf() and __floatdisf(). Programs only using +// __floatsisf() always get the smallest version. +// When the combined version is provided, this standalone version +// must be declared WEAK, so that the combined version can supersede it. +// '_arm_floatsisf' should appear before '_arm_floatdisf' in LIB1ASMFUNCS. +// Same parent section as __ul2f() to keep tail call branch within range. +#if defined(__OPTIMIZE_SIZE__) && __OPTIMIZE_SIZE__ +WEAK_START_SECTION aeabi_i2f .text.sorted.libgcc.fpcore.p.floatsisf +WEAK_ALIAS floatsisf aeabi_i2f + CFI_START_FUNCTION + +#else /* !__OPTIMIZE_SIZE__ */ +FUNC_START_SECTION aeabi_i2f .text.sorted.libgcc.fpcore.p.floatsisf +FUNC_ALIAS floatsisf aeabi_i2f + CFI_START_FUNCTION + +#endif /* !__OPTIMIZE_SIZE__ */ + + // Save the sign. + asrs r3, r0, #31 + + // Absolute value of the input. + eors r0, r3 + subs r0, r3 + + // Sign extension to long long unsigned. 
+ eors r1, r1 + b SYM(__internal_floatundisf_noswap) + + CFI_END_FUNCTION +FUNC_END floatsisf +FUNC_END aeabi_i2f + +#endif /* L_arm_floatsisf */ + + +#ifdef L_arm_floatdisf + +// float __aeabi_l2f(long long) +// Converts a signed 64-bit integer in $r1:$r0 to a float in $r0. +// See build comments for __floatsisf() above. +// Same parent section as __ul2f() to keep tail call branch within range. +#if defined(__OPTIMIZE_SIZE__) && __OPTIMIZE_SIZE__ +FUNC_START_SECTION aeabi_i2f .text.sorted.libgcc.fpcore.p.floatdisf +FUNC_ALIAS floatsisf aeabi_i2f + CFI_START_FUNCTION + + #if defined(__ARMEB__) && __ARMEB__ + // __floatdisf() expects a big-endian lower word in $r1. + movs xxl, r0 + #endif + + // Sign extension to long long signed. + asrs xxh, xxl, #31 + + FUNC_ENTRY aeabi_l2f + FUNC_ALIAS floatdisf aeabi_l2f + +#else /* !__OPTIMIZE_SIZE__ */ +FUNC_START_SECTION aeabi_l2f .text.sorted.libgcc.fpcore.p.floatdisf +FUNC_ALIAS floatdisf aeabi_l2f + CFI_START_FUNCTION + +#endif + + // Save the sign. + asrs r3, xxh, #31 + + // Absolute value of the input. + // Could this be arranged in big-endian mode so that this block also + // swapped the input words? Maybe. But, since neither 'eors' nor + // 'sbcs' allow a third destination register, it seems unlikely to + // save more than one cycle. Also, the size of __floatdisf() and + // __floatundisf() together would increase by two instructions. + eors xxl, r3 + eors xxh, r3 + subs xxl, r3 + sbcs xxh, r3 + + b SYM(__internal_floatundisf) + + CFI_END_FUNCTION +FUNC_END floatdisf +FUNC_END aeabi_l2f + +#if defined(__OPTIMIZE_SIZE__) && __OPTIMIZE_SIZE__ +FUNC_END floatsisf +FUNC_END aeabi_i2f +#endif + +#endif /* L_arm_floatsisf || L_arm_floatdisf */ + + +#ifdef L_arm_floatunsisf + +// float __aeabi_ui2f(unsigned) +// Converts an unsigned integer in $r0 to float. 
+FUNC_START_SECTION aeabi_ui2f .text.sorted.libgcc.fpcore.q.floatunsisf +FUNC_ALIAS floatunsisf aeabi_ui2f + CFI_START_FUNCTION + + #if defined(__ARMEB__) && __ARMEB__ + // In big-endian mode, function flow breaks down. __floatundisf() + // wants to swap word order, but __floatunsisf() does not. + // The choice is between leaving these arguments un-swapped and + // branching, or canceling out the word swap in advance. + // The branching version would require one extra instruction to + // clear the sign ($r3) because of __floatdisf() dependencies. + // While the branching version is technically one cycle faster + // on the Cortex-M0 pipeline, branchless just feels better. + + // Thus, __floatundisf() expects a big-endian lower word in $r1. + movs xxl, r0 + #endif + + // Extend to unsigned long long and fall through. + eors xxh, xxh + +#endif /* L_arm_floatunsisf */ + + +// The execution of __floatunsisf() flows directly into __floatundisf(), such +// that instructions must appear consecutively in the same memory section +// for proper flow control. However, this construction inhibits the ability +// to discard __floatunsisf() when only using __floatundisf(). +// Additionally, both __floatsisf() and __floatdisf() expect to tail call +// __internal_floatundisf() with a sign argument. The __internal_floatundisf() +// symbol itself is unambiguous, but there is a remote risk that the linker +// will prefer some other symbol in place of __floatsisf() or __floatdisf(). +// As a workaround, this block configures __internal_floatundisf() three times. +// The first version provides __internal_floatundisf() as a WEAK standalone +// symbol. The second provides __floatundisf() and __internal_floatundisf(), +// still as weak symbols. The third provides __floatunsisf() normally, but +// __floatundisf() remains weak in case the linker prefers another version. 
+// '_internal_floatundisf', '_arm_floatundisf', and '_arm_floatunsisf' should +// appear in the given order in LIB1ASMFUNCS. +#if defined(L_arm_floatunsisf) || defined(L_arm_floatundisf) || \ + defined(L_internal_floatundisf) + +#define UL2F_SECTION .text.sorted.libgcc.fpcore.q.floatundisf + +#if defined(L_arm_floatundisf) +// float __aeabi_ul2f(unsigned long long) +// Converts an unsigned 64-bit integer in $r1:$r0 to a float in $r0. +WEAK_START_SECTION aeabi_ul2f UL2F_SECTION +WEAK_ALIAS floatundisf aeabi_ul2f + CFI_START_FUNCTION +#elif defined(L_arm_floatunsisf) +FUNC_ENTRY aeabi_ul2f +FUNC_ALIAS floatundisf aeabi_ul2f +#endif + +#if defined(L_arm_floatundisf) || defined(L_arm_floatunsisf) + // Sign is always positive. + eors r3, r3 +#endif + +#if defined(L_arm_floatunsisf) + // float internal_floatundisf(unsigned long long, int) + // Internal function expects the sign of the result in $r3[0]. + FUNC_ENTRY internal_floatundisf + +#elif defined(L_arm_floatundisf) + WEAK_ENTRY internal_floatundisf + +#else /* L_internal_floatundisf */ + WEAK_START_SECTION internal_floatundisf UL2F_SECTION + CFI_START_FUNCTION + +#endif + + #if defined(__ARMEB__) && __ARMEB__ + // Swap word order for register compatibility with __fp_assemble(). + // Could this be optimized by re-defining __fp_assemble()? Maybe. + // But the ramifications of dynamic register assignment on all + // the other callers of __fp_assemble() would be enormous. + eors r0, r1 + eors r1, r0 + eors r0, r1 + #endif + +#ifdef L_arm_floatunsisf + FUNC_ENTRY internal_floatundisf_noswap +#else /* L_arm_floatundisf || L_internal_floatundisf */ + WEAK_ENTRY internal_floatundisf_noswap +#endif + // Default exponent, relative to bit[30] of $r1. + movs r2, #(127 - 1 + 63) + + // Format the sign. 
+ lsls r3, #31 + mov ip, r3 + + push { rT, lr } + b SYM(__fp_assemble) + + CFI_END_FUNCTION +FUNC_END internal_floatundisf_noswap +FUNC_END internal_floatundisf + +#if defined(L_arm_floatundisf) || defined(L_arm_floatunsisf) +FUNC_END floatundisf +FUNC_END aeabi_ul2f +#endif + +#if defined(L_arm_floatunsisf) +FUNC_END floatunsisf +FUNC_END aeabi_ui2f +#endif + +#endif /* L_arm_floatunsisf || L_arm_floatundisf */ + diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index 98fb544517e..26737edc6f6 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -2017,6 +2017,7 @@ LSYM(Lchange_\register): #include "eabi/futil.S" #include "eabi/fmul.S" #include "eabi/fdiv.S" +#include "eabi/ffloat.S" #endif /* NOT_ISA_TARGET_32BIT */ #include "eabi/lcmp.S" #endif /* !__symbian__ */ diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index 1812a1e1a99..645d20f5f1c 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -26,14 +26,17 @@ LIB1ASMFUNCS += \ _ctzsi2 \ _paritysi2 \ _popcountsi2 \ + _arm_floatundisf \ _arm_mulsf3 \ ifeq (__ARM_ARCH_ISA_THUMB 1,$(ARM_ISA)$(THUMB1_ISA)) # Group 0B: WEAK overridable function objects built for v6m only. 
LIB1ASMFUNCS += \ _internal_cmpsf2 \ + _internal_floatundisf \ _muldi3 \ _arm_addsf3 \ + _arm_floatsisf \ endif @@ -78,7 +81,6 @@ LIB1ASMFUNCS += \ _arm_fixsfsi \ _arm_fixunssfsi \ _arm_floatdisf \ - _arm_floatundisf \ _arm_muldivsf3 \ _arm_negsf2 \ _arm_unordsf2 \ @@ -99,6 +101,7 @@ LIB1ASMFUNCS += \ _arm_gesf2 \ _arm_frsubsf3 \ _arm_divsf3 \ + _arm_floatunsisf \ _fp_exceptionf \ _fp_checknanf \ _fp_assemblef \ 
From patchwork Fri Jan 15 11:30:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426925 From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 30/33] Import float-to-integer conversion from the CM0 library Date: Fri, 15 Jan 2021 03:30:58 -0800 Message-Id: <1715ad3348b588ae37fa784064d17164c40911dd.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 Cc: Richard.Earnshaw@foss.arm.com gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi-lib.h (muldi3): Removed duplicate. (fixunssfsi) Removed obsolete RENAME_LIBRARY directive. * config/arm/eabi/ffixed.S (__aeabi_f2iz, __aeabi_f2uiz, __aeabi_f2lz, __aeabi_f2ulz): New file. * config/arm/lib1funcs.S: #include eabi/ffixed.S (v6m only). * config/arm/t-elf (LIB1ASMFUNCS): Added _internal_fixsfdi, _internal_fixsfsi, _arm_fixsfdi, and _arm_fixunssfdi. --- libgcc/config/arm/bpabi-lib.h | 6 - libgcc/config/arm/eabi/ffixed.S | 414 ++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 1 + libgcc/config/arm/t-elf | 4 + 4 files changed, 419 insertions(+), 6 deletions(-) create mode 100644 libgcc/config/arm/eabi/ffixed.S diff --git a/libgcc/config/arm/bpabi-lib.h b/libgcc/config/arm/bpabi-lib.h index 1e651ead4ac..a1c631640bb 100644 --- a/libgcc/config/arm/bpabi-lib.h +++ b/libgcc/config/arm/bpabi-lib.h @@ -32,9 +32,6 @@ #ifdef L_muldi3 #define DECLARE_LIBRARY_RENAMES RENAME_LIBRARY (muldi3, lmul) #endif -#ifdef L_muldi3 -#define DECLARE_LIBRARY_RENAMES RENAME_LIBRARY (muldi3, lmul) -#endif #ifdef L_fixdfdi #define DECLARE_LIBRARY_RENAMES RENAME_LIBRARY (fixdfdi, d2lz) \ extern DWtype __fixdfdi (DFtype) __attribute__((pcs("aapcs"))); \ @@ -62,9 +59,6 @@ #ifdef L_fixunsdfsi #define DECLARE_LIBRARY_RENAMES RENAME_LIBRARY (fixunsdfsi, d2uiz) #endif -#ifdef L_fixunssfsi -#define DECLARE_LIBRARY_RENAMES RENAME_LIBRARY (fixunssfsi, f2uiz) -#endif #ifdef L_floatundidf #define DECLARE_LIBRARY_RENAMES RENAME_LIBRARY (floatundidf, ul2d) #endif diff --git a/libgcc/config/arm/eabi/ffixed.S b/libgcc/config/arm/eabi/ffixed.S new file mode 100644 index 00000000000..8ced3a701ff --- /dev/null +++ b/libgcc/config/arm/eabi/ffixed.S @@ -0,0 +1,414 @@ +/* ffixed.S: Thumb-1 optimized float-to-integer conversion + + Copyright (C) 2018-2021 Free Software Foundation, Inc. 
+ Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + + +// The implementation of __aeabi_f2uiz() expects to tail call __internal_f2iz() +// with the flags register set for unsigned conversion. The __internal_f2iz() +// symbol itself is unambiguous, but there is a remote risk that the linker +// will prefer some other symbol in place of __aeabi_f2iz(). Importing an +// archive file that exports __aeabi_f2iz() will throw an error in this case. +// As a workaround, this block configures __aeabi_f2iz() for compilation twice. +// The first version configures __internal_f2iz() as a WEAK standalone symbol, +// and the second exports __aeabi_f2iz() and __internal_f2iz() normally. +// A small bonus: programs only using __aeabi_f2uiz() will be slightly smaller. +// '_internal_fixsfsi' should appear before '_arm_fixsfsi' in LIB1ASMFUNCS. +#if defined(L_arm_fixsfsi) || \ + (defined(L_internal_fixsfsi) && \ + !(defined(__OPTIMIZE_SIZE__) && __OPTIMIZE_SIZE__)) + +// Subsection ordering within fpcore keeps conditional branches within range. 
+#define F2IZ_SECTION .text.sorted.libgcc.fpcore.r.fixsfsi + +// int __aeabi_f2iz(float) +// Converts a float in $r0 to signed integer, rounding toward 0. +// Values out of range are forced to either INT_MAX or INT_MIN. +// NAN becomes zero. +#ifdef L_arm_fixsfsi +FUNC_START_SECTION aeabi_f2iz F2IZ_SECTION +FUNC_ALIAS fixsfsi aeabi_f2iz + CFI_START_FUNCTION +#endif + + #if defined(__OPTIMIZE_SIZE__) && __OPTIMIZE_SIZE__ + // Flag for signed conversion. + movs r1, #33 + b SYM(__internal_fixsfdi) + + #else /* !__OPTIMIZE_SIZE__ */ + +#ifdef L_arm_fixsfsi + // Flag for signed conversion. + movs r3, #1 + + // [unsigned] int internal_f2iz(float, int) + // Internal function expects a boolean flag in $r3. + // If the boolean flag is 0, the result is unsigned. + // If the boolean flag is 1, the result is signed. + FUNC_ENTRY internal_f2iz + +#else /* L_internal_fixsfsi */ + WEAK_START_SECTION internal_f2iz F2IZ_SECTION + CFI_START_FUNCTION + +#endif + + // Isolate the sign of the result. + asrs r1, r0, #31 + lsls r0, #1 + + #if defined(FP_EXCEPTION) && FP_EXCEPTION + // Check for zero to avoid spurious underflow exception on -0. + beq LLSYM(__f2iz_return) + #endif + + // Isolate the exponent. + lsrs r2, r0, #24 + + #if defined(TRAP_NANS) && TRAP_NANS + // Test for NAN. + // Otherwise, NAN will be converted like +/-INF. + cmp r2, #255 + beq LLSYM(__f2iz_nan) + #endif + + // Extract the mantissa and restore the implicit '1'. Technically, + // this is wrong for subnormals, but they flush to zero regardless. + lsls r0, #8 + adds r0, #1 + rors r0, r0 + + // Calculate mantissa alignment. Given the implicit '1' in bit[31]: + // * An exponent less than 127 will automatically flush to 0. + // * An exponent of 127 will result in a shift of 31. + // * An exponent of 128 will result in a shift of 30. + // * ... + // * An exponent of 157 will result in a shift of 1. + // * An exponent of 158 will result in no shift at all. + // * An exponent larger than 158 will result in overflow. 
+ rsbs r2, #0 + adds r2, #158 + + // When the shift is less than minimum, the result will overflow. + // The only signed value to fail this test is INT_MIN (0x80000000), + // but it will be returned correctly from the overflow branch. + cmp r2, r3 + blt LLSYM(__f2iz_overflow) + + // If unsigned conversion of a negative value, also overflow. + // Would also catch -0.0f if not handled earlier. + cmn r3, r1 + blt LLSYM(__f2iz_overflow) + + #if defined(FP_EXCEPTION) && FP_EXCEPTION + // Save a copy for remainder testing + movs r3, r0 + #endif + + // Truncate the fraction. + lsrs r0, r2 + + // Two's complement negation, if applicable. + // Bonus: the sign in $r1 provides a suitable long long result. + eors r0, r1 + subs r0, r1 + + #if defined(FP_EXCEPTION) && FP_EXCEPTION + // If any bits set in the remainder, raise FE_INEXACT + rsbs r2, #0 + adds r2, #32 + lsls r3, r2 + bne LLSYM(__f2iz_inexact) + #endif + + LLSYM(__f2iz_return): + RET + + LLSYM(__f2iz_overflow): + // Positive unsigned integers (r1 == 0, r3 == 0), return 0xFFFFFFFF. + // Negative unsigned integers (r1 == -1, r3 == 0), return 0x00000000. + // Positive signed integers (r1 == 0, r3 == 1), return 0x7FFFFFFF. + // Negative signed integers (r1 == -1, r3 == 1), return 0x80000000. + // TODO: FE_INVALID exception, (but not for -2^31). + mvns r0, r1 + lsls r3, #31 + eors r0, r3 + RET + + #if defined(FP_EXCEPTION) && FP_EXCEPTION + LLSYM(__f2iz_inexact): + // TODO: Another class of exceptions that doesn't overwrite $r0. 
+ bkpt #0 + + #if defined(EXCEPTION_CODES) && EXCEPTION_CODES + movs r3, #(CAST_INEXACT) + #endif + + b SYM(__fp_exception) + #endif + + LLSYM(__f2iz_nan): + // Check for INF + lsls r2, r0, #9 + beq LLSYM(__f2iz_overflow) + + #if defined(FP_EXCEPTION) && FP_EXCEPTION + #if defined(EXCEPTION_CODES) && EXCEPTION_CODES + movs r3, #(CAST_UNDEFINED) + #endif + + b SYM(__fp_exception) + #endif + + #if defined(TRAP_NANS) && TRAP_NANS + + // TODO: Extend to long long + + // TODO: bl fp_check_nan + #endif + + // Return long long 0 on NAN. + eors r0, r0 + eors r1, r1 + RET + +FUNC_END internal_f2iz + + #endif /* !__OPTIMIZE_SIZE__ */ + + CFI_END_FUNCTION + +#ifdef L_arm_fixsfsi +FUNC_END fixsfsi +FUNC_END aeabi_f2iz +#endif + +#endif /* L_arm_fixsfsi || L_internal_fixsfsi */ + + +#ifdef L_arm_fixunssfsi + +// unsigned int __aeabi_f2uiz(float) +// Converts a float in $r0 to unsigned integer, rounding toward 0. +// Values out of range are forced to UINT_MAX. +// Negative values and NAN all become zero. +// Subsection ordering within fpcore keeps conditional branches within range. +FUNC_START_SECTION aeabi_f2uiz .text.sorted.libgcc.fpcore.s.fixunssfsi +FUNC_ALIAS fixunssfsi aeabi_f2uiz + CFI_START_FUNCTION + + #if defined(__OPTIMIZE_SIZE__) && __OPTIMIZE_SIZE__ + // Flag for unsigned conversion. + movs r1, #32 + b SYM(__internal_fixsfdi) + + #else /* !__OPTIMIZE_SIZE__ */ + // Flag for unsigned conversion. + movs r3, #0 + b SYM(__internal_f2iz) + + #endif /* !__OPTIMIZE_SIZE__ */ + + CFI_END_FUNCTION +FUNC_END fixunssfsi +FUNC_END aeabi_f2uiz + +#endif /* L_arm_fixunssfsi */ + + +// The implementation of __aeabi_f2ulz() expects to tail call __internal_fixsfdi() +// with the flags register set for unsigned conversion. The __internal_fixsfdi() +// symbol itself is unambiguous, but there is a remote risk that the linker +// will prefer some other symbol in place of __aeabi_f2lz(). Importing an +// archive file that exports __aeabi_f2lz() will throw an error in this case. 
+// As a workaround, this block configures __aeabi_f2lz() for compilation twice. +// The first version configures __internal_fixsfdi() as a WEAK standalone symbol, +// and the second exports __aeabi_f2lz() and __internal_fixsfdi() normally. +// A small bonus: programs only using __aeabi_f2ulz() will be slightly smaller. +// '_internal_fixsfdi' should appear before '_arm_fixsfdi' in LIB1ASMFUNCS. +#if defined(L_arm_fixsfdi) || defined(L_internal_fixsfdi) + +// Subsection ordering within fpcore keeps conditional branches within range. +#define F2LZ_SECTION .text.sorted.libgcc.fpcore.t.fixsfdi + +// long long __aeabi_f2lz(float) +// Converts a float in $r0 to a 64 bit integer in $r1:$r0, rounding toward 0. +// Values out of range are forced to either INT64_MAX or INT64_MIN. +// NAN becomes zero. +#ifdef L_arm_fixsfdi +FUNC_START_SECTION aeabi_f2lz F2LZ_SECTION +FUNC_ALIAS fixsfdi aeabi_f2lz + CFI_START_FUNCTION + + movs r1, #1 + + // [unsigned] long long int internal_fixsfdi(float, int) + // Internal function expects a shift flag in $r1. + // If the shift flag is 0, the result is unsigned. + // If the shift flag is 1, the result is signed. + // If the shift flag is 33, the result is signed int. + FUNC_ENTRY internal_fixsfdi + +#else /* L_internal_fixsfdi */ + WEAK_START_SECTION internal_fixsfdi F2LZ_SECTION + CFI_START_FUNCTION + +#endif + + // Split the sign of the result from the mantissa/exponent field. + // Handle +/-0 specially to avoid spurious exceptions. + asrs r3, r0, #31 + lsls r0, #1 + beq LLSYM(__f2lz_zero) + + // If unsigned conversion of a negative value, also overflow. + // Specifically, is the LSB of $r1 clear when $r3 is equal to '-1'? + // + // $r3 (sign) >= $r2 (flag) + // 0xFFFFFFFF false 0x00000000 + // 0x00000000 true 0x00000000 + // 0xFFFFFFFF true 0x80000000 + // 0x00000000 true 0x80000000 + // + // (NOTE: This test will also trap -0.0f, unless handled earlier.)
+ lsls r2, r1, #31 + cmp r3, r2 + blt LLSYM(__f2lz_overflow) + + // Isolate the exponent. + lsrs r2, r0, #24 + +// #if defined(TRAP_NANS) && TRAP_NANS +// // Test for NAN. +// // Otherwise, NAN will be converted like +/-INF. +// cmp r2, #255 +// beq LLSYM(__f2lz_nan) +// #endif + + // Calculate mantissa alignment. Given the implicit '1' in bit[31]: + // * An exponent less than 127 will automatically flush to 0. + // * An exponent of 127 will result in a shift of 63. + // * An exponent of 128 will result in a shift of 62. + // * ... + // * An exponent of 189 will result in a shift of 1. + // * An exponent of 190 will result in no shift at all. + // * An exponent larger than 190 will result in overflow + // (189 in the case of signed integers). + rsbs r2, #0 + adds r2, #190 + // When the shift is less than minimum, the result will overflow. + // The only signed value to fail this test is INT_MIN (0x80000000), + // but it will be returned correctly from the overflow branch. + cmp r2, r1 + blt LLSYM(__f2lz_overflow) + + // Extract the mantissa and restore the implicit '1'. Technically, + // this is wrong for subnormals, but they flush to zero regardless. + lsls r0, #8 + adds r0, #1 + rors r0, r0 + + // Calculate the upper word. + // If the shift is greater than 32, gives an automatic '0'. + movs r1, r0 + lsrs r1, r2 + + // Reduce the shift for the lower word. + // If the original shift was less than 32, the result may be split + // between the upper and lower words. + subs r2, #32 + blt LLSYM(__f2lz_split) + + // Shift is still positive, keep moving right. + lsrs r0, r2 + + // TODO: Remainder test. + // $r1 is technically free, as long as it's zero by the time + // this is over. + + LLSYM(__f2lz_return): + // Two's complement negation, if the original was negative. 
+ eors r0, r3 + eors r1, r3 + subs r0, r3 + sbcs r1, r3 + RET + + LLSYM(__f2lz_split): + // Shift was negative, calculate the remainder + rsbs r2, #0 + lsls r0, r2 + b LLSYM(__f2lz_return) + + LLSYM(__f2lz_zero): + eors r1, r1 + RET + + LLSYM(__f2lz_overflow): + // Positive unsigned integers (r3 == 0, r1 == 0), return 0xFFFFFFFF. + // Negative unsigned integers (r3 == -1, r1 == 0), return 0x00000000. + // Positive signed integers (r3 == 0, r1 == 1), return 0x7FFFFFFF. + // Negative signed integers (r3 == -1, r1 == 1), return 0x80000000. + // TODO: FE_INVALID exception, (but not for -2^63). + mvns r0, r3 + + // For 32-bit results + lsls r2, r1, #26 + lsls r1, #31 + ands r2, r1 + eors r0, r2 + + eors r1, r0 + RET + + CFI_END_FUNCTION +FUNC_END internal_fixsfdi + +#ifdef L_arm_fixsfdi +FUNC_END fixsfdi +FUNC_END aeabi_f2lz +#endif + +#endif /* L_arm_fixsfdi || L_internal_fixsfdi */ + + +#ifdef L_arm_fixunssfdi + +// unsigned long long __aeabi_f2ulz(float) +// Converts a float in $r0 to a 64 bit integer in $r1:$r0, rounding toward 0. +// Values out of range are forced to UINT64_MAX. +// Negative values and NAN all become zero. +// Subsection ordering within fpcore keeps conditional branches within range. 
+FUNC_START_SECTION aeabi_f2ulz .text.sorted.libgcc.fpcore.u.fixunssfdi +FUNC_ALIAS fixunssfdi aeabi_f2ulz + CFI_START_FUNCTION + + eors r1, r1 + b SYM(__internal_fixsfdi) + + CFI_END_FUNCTION +FUNC_END fixunssfdi +FUNC_END aeabi_f2ulz + +#endif /* L_arm_fixunssfdi */ + diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index 26737edc6f6..12f39380ac0 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -2017,6 +2017,7 @@ LSYM(Lchange_\register): #include "eabi/futil.S" #include "eabi/fmul.S" #include "eabi/fdiv.S" +#include "eabi/ffixed.S" #include "eabi/ffloat.S" #endif /* NOT_ISA_TARGET_32BIT */ #include "eabi/lcmp.S" diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index 645d20f5f1c..6b0bb642ef5 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -34,6 +34,8 @@ ifeq (__ARM_ARCH_ISA_THUMB 1,$(ARM_ISA)$(THUMB1_ISA)) LIB1ASMFUNCS += \ _internal_cmpsf2 \ _internal_floatundisf \ + _internal_fixsfdi \ + _internal_fixsfsi \ _muldi3 \ _arm_addsf3 \ _arm_floatsisf \ @@ -102,6 +104,8 @@ LIB1ASMFUNCS += \ _arm_frsubsf3 \ _arm_divsf3 \ _arm_floatunsisf \ + _arm_fixsfdi \ + _arm_fixunssfdi \ _fp_exceptionf \ _fp_checknanf \ _fp_assemblef \ From patchwork Fri Jan 15 11:30:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426926 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com 
header.a=rsa-sha256 header.s=fm1 header.b=cude8ZdG; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=d8yyq03r; dkim-atps=neutral Received: from sourceware.org (unknown [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJwM1H6wz9sXG for ; Fri, 15 Jan 2021 22:33:39 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EC28C3982423; Fri, 15 Jan 2021 11:32:13 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id 96D9939730CA for ; Fri, 15 Jan 2021 11:32:10 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 96D9939730CA Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id A66C3EBC; Fri, 15 Jan 2021 06:32:09 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:32:09 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=hJUUTeDDolJsy Z8akxra/e7ym+CHFN3HB9vQ/mztc6s=; b=cude8ZdGBlPR5/NgJQh2MWVIY+6rF a/82/rm29138MyVeqfuOv6pfSoJ/vTh1zRCaY0txQ4CiW4Ez4w2aG2XpMSbShkRv JH9xoecJiTlg7DatkPzJu5xppzunWRBgeU54JlER1XgZOVfpvBmltr4RmrW4I5M3 MUoBqT4Jzm7u5GYXqpdCEFbuS3PA6rD+PAtlQ9JoZBgnPm20GWe3ltndiEZ2FDec 
wWhsnOZsm2lQNdr6e6O/SCL5DFjvFqpHzKo5/qMgDEb0ZqEFaBojiTHHuG7Hv11r Kc/aHtfqqrpwMMvsrYOu9j+uws4354daXfoz2ber1BQeAEqY4Dy42bh/w== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=hJUUTeDDolJsyZ8akxra/e7ym+CHFN3HB9vQ/mztc6s=; b=d8yyq03r 9I5CAGVtVQD/+SBVPTsw/Int7FsA9IgkgSRmcjHHOu1OpIOmyYUSOy00LIl11+Zh ChvOubAGSYMhecT2sRAAwIW+U7PlyZt/M6MQskv0dyaRH+1PEV9xEi1rggF21Mb4 tFLpl9WAzOpTrxE0E73tjiKSCtT0t8aOssZCyA6Wzkb1C/w5y+iP1OFo9ZCWq5Et C1qxVzFBwl6tiyObmQ4ysFzqooKq9JEGXsHqD/JXKs/hh2E9W61C/h4gQgtQLHvh UsP2wKumaQiKnQ65DOg7NcBBlcJN0R8dRywH3rFfOG9rcnZhGtovZVT8Mgpth0g7 2J7K9gpHQP5IfA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddthecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeekteehleehudfgheehkedugeelfe ehffevuddvkeduffevveehgfdtvdehledtffenucffohhmrghinhepfhgtrghsthdrshgs pdhgnhhurdhorhhgpdhlihgsudhfuhhntghsrdhssgenucfkphepjedurdefiedruddttd drvddvtdenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhm pehgnhhusegurghnihgvlhgvnhhgvghlrdgtohhm X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id C1EA0108005B; Fri, 15 Jan 2021 06:32:08 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBW7A9023784; Fri, 15 Jan 2021 03:32:07 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 31/33] Import float<->double conversion from the CM0 library Date: 
Fri, 15 Jan 2021 03:30:59 -0800 Message-Id: <733ecf7ed82b2cdbb6b5eba2d1532d4a18121c4a.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/eabi/fcast.S (__aeabi_d2f, __aeabi_f2d): New file. * config/arm/lib1funcs.S: #include eabi/fcast.S (v6m only). * config/arm/t-elf (LIB1ASMFUNCS): Added _arm_d2f and _arm_f2d. --- libgcc/config/arm/eabi/fcast.S | 256 +++++++++++++++++++++++++++++++++ libgcc/config/arm/lib1funcs.S | 1 + libgcc/config/arm/t-elf | 2 + 3 files changed, 259 insertions(+) create mode 100644 libgcc/config/arm/eabi/fcast.S diff --git a/libgcc/config/arm/eabi/fcast.S b/libgcc/config/arm/eabi/fcast.S new file mode 100644 index 00000000000..b1184ee1d53 --- /dev/null +++ b/libgcc/config/arm/eabi/fcast.S @@ -0,0 +1,256 @@ +/* fcast.S: Thumb-1 optimized 32- and 64-bit float conversions + + Copyright (C) 2018-2021 Free Software Foundation, Inc. + Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com) + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. 
+ + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + + +#ifdef L_arm_f2d + +// double __aeabi_f2d(float) +// Converts a single-precision float in $r0 to double-precision in $r1:$r0. +// Rounding, overflow, and underflow are impossible. +// INF and ZERO are returned unmodified. +FUNC_START_SECTION aeabi_f2d .text.sorted.libgcc.fpcore.v.f2d +FUNC_ALIAS extendsfdf2 aeabi_f2d + CFI_START_FUNCTION + + // Save the sign. + lsrs r1, r0, #31 + lsls r1, #31 + + // Set up registers for __fp_normalize2(). + push { rT, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + // Test for zero. + lsls r0, #1 + beq LLSYM(__f2d_return) + + // Split the exponent and mantissa into separate registers. + // This is the most efficient way to convert subnormals in the + // single-precision form into normals in double-precision. + // This does add a leading implicit '1' to INF and NAN, + // but that will be absorbed when the value is re-assembled. + movs r2, r0 + bl SYM(__fp_normalize2) __PLT__ + + // Set up the exponent bias. For INF/NAN values, the bias + // is 1791 (2047 - 255 - 1), where the last '1' accounts + // for the implicit '1' in the mantissa.
+ movs r0, #3 + lsls r0, #9 + adds r0, #255 + + // Test for INF/NAN, promote exponent if necessary + cmp r2, #255 + beq LLSYM(__f2d_indefinite) + + // For normal values, the exponent bias is 895 (1023 - 127 - 1), + // which is half of the prepared INF/NAN bias. + lsrs r0, #1 + + LLSYM(__f2d_indefinite): + // Assemble exponent with bias correction. + adds r2, r0 + lsls r2, #20 + adds r1, r2 + + // Assemble the high word of the mantissa. + lsrs r0, r3, #11 + add r1, r0 + + // Remainder of the mantissa in the low word of the result. + lsls r0, r3, #21 + + LLSYM(__f2d_return): + pop { rT, pc } + .cfi_restore_state + + CFI_END_FUNCTION +FUNC_END extendsfdf2 +FUNC_END aeabi_f2d + +#endif /* L_arm_f2d */ + + +#if defined(L_arm_d2f) || defined(L_arm_truncdfsf2) + +// HACK: Build two separate implementations: +// * __aeabi_d2f() rounds to nearest per traditional IEEE-754 rules. +// * __truncdfsf2() rounds towards zero per GCC specification. +// Presumably, a program will consistently use one ABI or the other, +// which means that code size will not be duplicated in practice. +// Merging two versions with dynamic rounding would be rather hard. +#ifdef L_arm_truncdfsf2 + #define D2F_NAME truncdfsf2 + #define D2F_SECTION .text.sorted.libgcc.fpcore.x.truncdfsf2 +#else + #define D2F_NAME aeabi_d2f + #define D2F_SECTION .text.sorted.libgcc.fpcore.w.d2f +#endif + +// float __aeabi_d2f(double) +// Converts a double-precision float in $r1:$r0 to single-precision in $r0. +// Values out of range become ZERO or INF; returns the upper 23 bits of NAN. +FUNC_START_SECTION D2F_NAME D2F_SECTION + CFI_START_FUNCTION + + // Save the sign. + lsrs r2, r1, #31 + lsls r2, #31 + mov ip, r2 + + // Isolate the exponent (11 bits). + lsls r2, r1, #1 + lsrs r2, #21 + + // Isolate the mantissa. It's safe to always add the implicit '1' -- + // even for subnormals -- since they will underflow in every case.
+ lsls r1, #12 + adds r1, #1 + rors r1, r1 + lsrs r3, r0, #21 + adds r1, r3 + + #ifndef L_arm_truncdfsf2 + // Fix the remainder. Even though the mantissa already has 32 bits + // of significance, this value still influences rounding ties. + lsls r0, #11 + #endif + + // Test for INF/NAN (r3 = 2047) + mvns r3, r2 + lsrs r3, #21 + cmp r3, r2 + beq LLSYM(__d2f_indefinite) + + // Adjust exponent bias. Offset is 127 - 1023, less 1 more since + // __fp_assemble() expects the exponent relative to bit[30]. + lsrs r3, #1 + subs r2, r3 + adds r2, #126 + + #ifndef L_arm_truncdfsf2 + LLSYM(__d2f_overflow): + // Use the standard formatting for overflow and underflow. + push { rT, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + b SYM(__fp_assemble) + .cfi_restore_state + + #else /* L_arm_truncdfsf2 */ + // In theory, __truncdfsf2() could also push registers and branch to + // __fp_assemble() after calculating the truncation shift and clearing + // bits. __fp_assemble() always rounds down if there is no remainder. + // However, after doing all of that work, the incremental cost to + // finish assembling the return value is only 6 or 7 instructions + // (depending on how __d2f_overflow() returns). + // This seems worthwhile to avoid linking in all of __fp_assemble(). + + // Test for INF. + cmp r2, #254 + bge LLSYM(__d2f_overflow) + + #if defined(FP_EXCEPTIONS) && FP_EXCEPTIONS + // Preserve inexact zero. + orrs r0, r1 + #endif + + // HACK: Pre-empt the default round-to-nearest mode, + // since GCC specifies rounding towards zero. + // Start by identifying subnormals by negative exponents. + asrs r3, r2, #31 + ands r3, r2 + + // Clear the exponent field if the result is subnormal. + eors r2, r3 + + // Add the subnormal shift to the nominal 8 bits of standard remainder. + // Also, saturate the low byte if the shift is larger than 32 bits. 
+ // Anything larger would flush to zero anyway, and the shift + // instructions only examine the low byte of the second operand. + // Basically: + // x = (-x + 8 > 32) ? 255 : (-x + 8) + // x = (x + 24 < 0) ? 255 : (-x + 8) + // x = (x + 24 < 0) ? 255 : (-(x + 24) + 32) + adds r3, #24 + asrs r0, r3, #31 + subs r3, #32 + rsbs r3, #0 + orrs r3, r0 + + // Clear the insignificant bits. + lsrs r1, r3 + + // Combine the mantissa and the exponent. + lsls r2, #23 + adds r0, r1, r2 + + // Combine with the saved sign. + add r0, ip + RET + + LLSYM(__d2f_overflow): + // Construct signed INF in $r0. + movs r0, #255 + lsls r0, #23 + add r0, ip + RET + + #endif /* L_arm_truncdfsf2 */ + + LLSYM(__d2f_indefinite): + // Test for INF. If the mantissa, exclusive of the implicit '1', + // is equal to '0', the result will be INF. + lsls r3, r1, #1 + orrs r3, r0 + beq LLSYM(__d2f_overflow) + + // TODO: Support for TRAP_NANS here. + // This will be double precision, not compatible with the current handler. + + // Construct NAN with the upper 22 bits of the mantissa, setting bit[21] + // to ensure a valid NAN without changing bit[22] (quiet) + subs r2, #0xD + lsls r0, r2, #20 + lsrs r1, #8 + orrs r0, r1 + + #if defined(STRICT_NANS) && STRICT_NANS + // Yes, the NAN was probably altered, but at least keep the sign...
+ add r0, ip + #endif + + RET + + CFI_END_FUNCTION +FUNC_END D2F_NAME + +#endif /* L_arm_d2f || L_arm_truncdfsf2 */ + diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S index 12f39380ac0..5148957144b 100644 --- a/libgcc/config/arm/lib1funcs.S +++ b/libgcc/config/arm/lib1funcs.S @@ -2019,6 +2019,7 @@ LSYM(Lchange_\register): #include "eabi/fdiv.S" #include "eabi/ffixed.S" #include "eabi/ffloat.S" +#include "eabi/fcast.S" #endif /* NOT_ISA_TARGET_32BIT */ #include "eabi/lcmp.S" #endif /* !__symbian__ */ diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf index 6b0bb642ef5..434a7a85598 100644 --- a/libgcc/config/arm/t-elf +++ b/libgcc/config/arm/t-elf @@ -106,6 +106,8 @@ LIB1ASMFUNCS += \ _arm_floatunsisf \ _arm_fixsfdi \ _arm_fixunssfdi \ + _arm_d2f \ + _arm_f2d \ _fp_exceptionf \ _fp_checknanf \ _fp_assemblef \ From patchwork Fri Jan 15 11:31:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Engel X-Patchwork-Id: 1426927 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=danielengel.com header.i=@danielengel.com header.a=rsa-sha256 header.s=fm1 header.b=WUj9ktby; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.a=rsa-sha256 header.s=fm1 header.b=TfSj8PsX; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 
bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DHJwQ6ByDz9sWj for ; Fri, 15 Jan 2021 22:33:42 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id CF2CB3985465; Fri, 15 Jan 2021 11:32:15 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by sourceware.org (Postfix) with ESMTPS id B1D6B39730CA for ; Fri, 15 Jan 2021 11:32:12 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org B1D6B39730CA Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=danielengel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gnu@danielengel.com Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id C85A0F7E; Fri, 15 Jan 2021 06:32:11 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Fri, 15 Jan 2021 06:32:12 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=danielengel.com; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=k5XVaJY4ua68x /WqDYSUKs9r0or7oFvbpW8fccz+Cd8=; b=WUj9ktbyN7ngg51zfot2OgB2uz6jc V9yxq8lxkHluU0eFRrde7i0zAU2/PwVDfs06BCV52CsER76BmCMj8NW1q3eZOQon W4SEaIcPpCchuDwFk74tOK8H5TVDTIMAggJYt0RjYSRu11YWH6aHO2iGjlUOOIOm OExIm+9e3E5ZQ6JzYAUBIqP/CzRqVpvuZ5xO96ceflD6FDsOHfdTnh/CWjbHo3sy sz7aXq8MEmbMobD0Z9KfkVvA60rXdv0eTEh6Rz8htenvZJA/n/JEnYxdm+ivZvN9 5I3srzSAC7WgDeK8S/wCGcir/DhhCHiy2zwGJvk+KXffrS2GTD6iNlIAw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; 
bh=k5XVaJY4ua68x/WqDYSUKs9r0or7oFvbpW8fccz+Cd8=; b=TfSj8PsX xd3QvpO6Rqp8YUEp8jfT8PjkoqYCFyBLEd1vCF2csHa1nPEn7eC55c5/jN4gZgmJ X1KDQ1xJ/icSZ1u7eMTWxkKfC1XIedJ5SeefTau40Y6zuOL6sM97NKpLydDK8tcQ AinZM8CfLmADtIPii3Z20Iq/jGkFNsDTAFKRF3DKU/Dxtwj1e/CdskOb+3C/xMwK 5THefm5W7/83sZaYAq138B5teYf2cl33LTolkkfjsfjYJjClm3bcRjBfZChXU9BO zlUz85GfpPYYcALa0+YGiuXsvNDdPBG8tzqQTjJyclwzU01bHXodb6OanjIeNGYz tlngYH1beUufgA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrtddvgddthecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeffrghnihgvlhcugfhnghgvlhcuoehgnhhusegurghnihgvlhgv nhhgvghlrdgtohhmqeenucggtffrrghtthgvrhhnpeeugeeuudekffffhffhkeetheekge elffefgeffleehgeehteektddtieelheegffenucffohhmrghinhepfhgtrghsthdrshgs necukfhppeejuddrfeeirddutddtrddvvddtnecuvehluhhsthgvrhfuihiivgeptdenuc frrghrrghmpehmrghilhhfrhhomhepghhnuhesuggrnhhivghlvghnghgvlhdrtghomh X-ME-Proxy: Received: from sendmail.lorien.danielengel.com (71-36-100-220.ptld.qwest.net [71.36.100.220]) by mail.messagingengine.com (Postfix) with ESMTPA id E5E741080066; Fri, 15 Jan 2021 06:32:10 -0500 (EST) Received: from ubuntu.lorien.danielengel.com (ubuntu.lorien.danielengel.com [10.0.0.96]) by sendmail.lorien.danielengel.com (8.15.2/8.15.2) with ESMTP id 10FBW971023787; Fri, 15 Jan 2021 03:32:09 -0800 (PST) (envelope-from gnu@danielengel.com) From: Daniel Engel To: gcc-patches@gcc.gnu.org Subject: [PATCH v5 32/33] Import float<->__fp16 conversion from the CM0 library Date: Fri, 15 Jan 2021 03:31:00 -0800 Message-Id: <11902122f4c7854d5b40ee584984f0dacabcb2f9.1610709584.git.gnu@danielengel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-13.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, 
SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@foss.arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/eabi/fcast.S (__aeabi_h2f, __aeabi_f2h): Added functions. * config/arm/fp16 (__gnu_f2h_ieee, __gnu_h2f_ieee, __gnu_f2h_alternative, __gnu_h2f_alternative): Disable build for v6m multilibs. * config/arm/t-bpabi (LIB1ASMFUNCS): Added _aeabi_f2h_ieee, _aeabi_h2f_ieee, _aeabi_f2h_alt, and _aeabi_h2f_alt (v6m only). --- libgcc/config/arm/eabi/fcast.S | 277 +++++++++++++++++++++++++++++++++ libgcc/config/arm/fp16.c | 4 + libgcc/config/arm/t-bpabi | 7 + 3 files changed, 288 insertions(+) diff --git a/libgcc/config/arm/eabi/fcast.S b/libgcc/config/arm/eabi/fcast.S index b1184ee1d53..e5a34d69578 100644 --- a/libgcc/config/arm/eabi/fcast.S +++ b/libgcc/config/arm/eabi/fcast.S @@ -254,3 +254,280 @@ FUNC_END D2F_NAME #endif /* L_arm_d2f || L_arm_truncdfsf2 */ + +#if defined(L_aeabi_h2f_ieee) || defined(L_aeabi_h2f_alt) + +#ifdef L_aeabi_h2f_ieee + #define H2F_NAME aeabi_h2f + #define H2F_ALIAS gnu_h2f_ieee +#else + #define H2F_NAME aeabi_h2f_alt + #define H2F_ALIAS gnu_h2f_alternative +#endif + +// float __aeabi_h2f(short hf) +// float __aeabi_h2f_alt(short hf) +// Converts a half-precision float in $r0 to single-precision. +// Rounding, overflow, and underflow conditions are impossible. +// In IEEE mode, INF, ZERO, and NAN are returned unmodified. +FUNC_START_SECTION H2F_NAME .text.sorted.libgcc.h2f +FUNC_ALIAS H2F_ALIAS H2F_NAME + CFI_START_FUNCTION + + // Set up registers for __fp_normalize2(). 
+ push { rT, lr } + .cfi_remember_state + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset rT, 0 + .cfi_rel_offset lr, 4 + + // Save the mantissa and exponent. + lsls r2, r0, #17 + + // Isolate the sign. + lsrs r0, #15 + lsls r0, #31 + + // Align the exponent at bit[24] for normalization. + // If zero, return the original sign. + lsrs r2, #3 + + #ifdef __HAVE_FEATURE_IT + do_it eq + RETc(eq) + #else + beq LLSYM(__h2f_return) + #endif + + // Split the exponent and mantissa into separate registers. + // This is the most efficient way to convert subnormals in the + // half-precision form into normals in single-precision. + // This does add a leading implicit '1' to INF and NAN, + // but that will be absorbed when the value is re-assembled. + bl SYM(__fp_normalize2) __PLT__ + + #ifdef L_aeabi_h2f_ieee + // Set up the exponent bias. For INF/NAN values, the bias is 223, + // where the last '1' accounts for the implicit '1' in the mantissa. + adds r2, #(255 - 31 - 1) + + // Test for INF/NAN. + cmp r2, #254 + + #ifdef __HAVE_FEATURE_IT + do_it ne + #else + beq LLSYM(__h2f_assemble) + #endif + + // For normal values, the bias should have been 111. + // However, this offset must be adjusted per the INF check above. + IT(sub,ne) r2, #((255 - 31 - 1) - (127 - 15 - 1)) + + #else /* L_aeabi_h2f_alt */ + // Set up the exponent bias. All values are normal. + adds r2, #(127 - 15 - 1) + #endif + + LLSYM(__h2f_assemble): + // Combine exponent and sign. + lsls r2, #23 + adds r0, r2 + + // Combine mantissa. 
+        lsrs r3, #8
+        add r0, r3
+
+    LLSYM(__h2f_return):
+        pop { rT, pc }
+                .cfi_restore_state
+
+    CFI_END_FUNCTION
+FUNC_END H2F_NAME
+FUNC_END H2F_ALIAS
+
+#endif /* L_aeabi_h2f_ieee || L_aeabi_h2f_alt */
+
+
+#if defined(L_aeabi_f2h_ieee) || defined(L_aeabi_f2h_alt)
+
+#ifdef L_aeabi_f2h_ieee
+  #define F2H_NAME aeabi_f2h
+  #define F2H_ALIAS gnu_f2h_ieee
+#else
+  #define F2H_NAME aeabi_f2h_alt
+  #define F2H_ALIAS gnu_f2h_alternative
+#endif
+
+// short __aeabi_f2h(float f)
+// short __aeabi_f2h_alt(float f)
+// Converts a single-precision float in $r0 to half-precision,
+// rounding to nearest, ties to even.
+// Values out of range are forced to either ZERO or INF.
+// In IEEE mode, the upper 12 bits of a NAN will be preserved.
+FUNC_START_SECTION F2H_NAME .text.sorted.libgcc.f2h
+FUNC_ALIAS F2H_ALIAS F2H_NAME
+    CFI_START_FUNCTION
+
+        // Set up the sign.
+        lsrs r2, r0, #31
+        lsls r2, #15
+
+        // Save the exponent and mantissa.
+        // If ZERO, return the original sign.
+        lsls r0, #1
+
+  #ifdef __HAVE_FEATURE_IT
+        do_it ne,t
+        addne r0, r2
+        RETc(ne)
+  #else
+        beq LLSYM(__f2h_return)
+  #endif
+
+        // Isolate the exponent.
+        lsrs r1, r0, #24
+
+  #ifdef L_aeabi_f2h_ieee
+        // Check for NAN.
+        cmp r1, #255
+        beq LLSYM(__f2h_indefinite)
+
+        // Check for overflow.
+        cmp r1, #(127 + 15)
+        bhi LLSYM(__f2h_overflow)
+
+  #else /* L_aeabi_f2h_alt */
+        // Detect overflow.
+        subs r1, #(127 + 16)
+        rsbs r3, r1, $0
+        asrs r3, #31
+
+        // Saturate the mantissa on overflow.
+        bics r0, r3
+        lsrs r3, #17
+        orrs r0, r3
+        bcs LLSYM(__f2h_return)
+
+  #endif /* L_aeabi_f2h_alt */
+
+        // Isolate the mantissa, adding back the implicit '1'.
+        lsls r0, #8
+        adds r0, #1
+        rors r0, r0
+
+        // Adjust exponent bias for half-precision, including '1' to
+        // account for the mantissa's implicit '1'.
+  #ifdef L_aeabi_f2h_ieee
+        subs r1, #(127 - 15 + 1)
+  #else
+        adds r1, #((127 + 16) - (127 - 15 + 1))
+  #endif
+
+        bmi LLSYM(__f2h_underflow)
+
+        // This next part is delicate. The rounding check requires a scratch
+        // register, but the sign can't be merged in until after the final
+        // overflow check below. Prepare the exponent.
+        // The mantissa and exponent can be combined, but the exponent
+        // must be prepared now while the flags don't matter.
+        lsls r1, #10
+
+        // Split the mantissa (11 bits) and remainder (13 bits).
+        lsls r3, r0, #12
+        lsrs r0, #21
+
+        // Combine mantissa and exponent without affecting flags.
+        add r0, r1
+
+    LLSYM(__f2h_round):
+        // If the carry bit is '0', always round down.
+  #ifdef __HAVE_FEATURE_IT
+        do_it cs,t
+        addcs r0, r2
+        RETc(cs)
+  #else
+        bcc LLSYM(__f2h_return)
+  #endif
+
+        // Carry was set. If a tie (no remainder) and the
+        // LSB of the result is '0', round down (to even).
+        lsls r1, r0, #31
+        orrs r1, r3
+
+  #ifdef __HAVE_FEATURE_IT
+        do_it ne
+  #else
+        beq LLSYM(__f2h_return)
+  #endif
+
+        // Round up, ties to even.
+     IT(add,ne) r0, #1
+
+  #ifndef L_aeabi_f2h_ieee
+        // HACK: The result may overflow to -0 not INF in alt mode.
+        // Subtract overflow to reverse.
+        lsrs r3, r0, #15
+        subs r0, r3
+  #endif
+
+    LLSYM(__f2h_return):
+        // Combine mantissa and exponent with the sign.
+        adds r0, r2
+        RET
+
+    LLSYM(__f2h_underflow):
+        // Align the remainder. The remainder consists of the last 12 bits
+        // of the mantissa plus the magnitude of underflow.
+        movs r3, r0
+        adds r1, #12
+        lsls r3, r1
+
+        // Align the mantissa. The MSB of the remainder must be
+        // shifted out last into the 'C' flag for rounding.
+        subs r1, #33
+        rsbs r1, #0
+        lsrs r0, r1
+        b LLSYM(__f2h_round)
+
+  #ifdef L_aeabi_f2h_ieee
+    LLSYM(__f2h_overflow):
+        // Create single-precision INF from which to construct half-precision.
+        movs r0, #255
+        lsls r0, #24
+
+    LLSYM(__f2h_indefinite):
+        // Check for INF.
+        lsls r3, r0, #8
+
+      #ifdef __HAVE_FEATURE_IT
+        do_it ne,t
+      #else
+        beq LLSYM(__f2h_infinite)
+      #endif
+
+        // HACK: The ARM specification states "the least significant 13 bits
+        // of a NAN are lost in the conversion." But what happens when the
+        // NAN-ness of the value resides in these 13 bits?
+        // Set bit[8] to ensure NAN without changing bit[9] (quiet).
+     IT(add,ne) r2, #128
+     IT(add,ne) r2, #128
+
+    LLSYM(__f2h_infinite):
+        // Construct the result from the upper 11 bits of the mantissa
+        // and the lower 5 bits of the exponent.
+        lsls r0, #3
+        lsrs r0, #17
+
+        // Combine with the sign (and possibly NAN flag).
+        orrs r0, r2
+        RET
+
+  #endif /* L_aeabi_f2h_ieee */
+
+    CFI_END_FUNCTION
+FUNC_END F2H_NAME
+FUNC_END F2H_ALIAS
+
+#endif /* L_aeabi_f2h_ieee || L_aeabi_f2h_alt */
+
diff --git a/libgcc/config/arm/fp16.c b/libgcc/config/arm/fp16.c
index db628ed1de4..f0e72385fbd 100644
--- a/libgcc/config/arm/fp16.c
+++ b/libgcc/config/arm/fp16.c
@@ -198,6 +198,8 @@ __gnu_h2f_internal(unsigned short a, int ieee)
   return sign | (((aexp + 0x70) << 23) + (mantissa << 13));
 }
+#if (__ARM_ARCH_ISA_ARM) || (__ARM_ARCH_ISA_THUMB > 1)
+
 unsigned short
 __gnu_f2h_ieee(unsigned int a)
 {
@@ -222,6 +224,8 @@ __gnu_h2f_alternative(unsigned short a)
   return __gnu_h2f_internal(a, 0);
 }
+#endif /* NOT_ISA_TARGET_32BIT */
+
 unsigned short
 __gnu_d2h_ieee (unsigned long long a)
 {
diff --git a/libgcc/config/arm/t-bpabi b/libgcc/config/arm/t-bpabi
index 86234d5676f..1b1ecfc638e 100644
--- a/libgcc/config/arm/t-bpabi
+++ b/libgcc/config/arm/t-bpabi
@@ -1,6 +1,13 @@
 # Add the bpabi.S functions.
 LIB1ASMFUNCS += _aeabi_lcmp _aeabi_ulcmp _aeabi_ldivmod _aeabi_uldivmod
+# Only enabled for v6m.
+ARM_ISA:=$(findstring __ARM_ARCH_ISA_ARM,$(shell $(gcc_compile_bare) -dM -E -

X-Patchwork-Id: 1426928
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Cc: Richard.Earnshaw@foss.arm.com
Subject: [PATCH v5 33/33] Drop single-precision Thumb-1 soft-float functions
Date: Fri, 15 Jan 2021 03:31:01 -0800
Message-Id: <907750a524e830db4d0bdf58eb212bf50a3d12cc.1610709584.git.gnu@danielengel.com>

With the complete CM0 library integrated, regression testing showed new
failures with the message "compilation failed to produce executable":

    gcc.dg/fixed-point/convert-float-1.c
    gcc.dg/fixed-point/convert-float-3.c
    gcc.dg/fixed-point/convert-sat.c

Investigating, this appears to be caused by the linker. I can't find a
comprehensive linker specification to claim this is actually a bug, but it
certainly doesn't match my expectations.
I found issues with the link order of these symbols:

  * __aeabi_fmul()
  * __aeabi_f2d()
  * __aeabi_f2iz()

Specifically, I expect the linker to import the _first_ definition of any
symbol. This is the basic behavior that allows the soft-float library to
supply missing symbols on architectures without optimized routines.

Comparing the v6-m multilib with the default, I see symbol exports for all
of the affected symbols:

    gcc-obj/gcc/libgcc.a:

        // assembly routines

        _arm_mulsf3.o:
        00000000 W __aeabi_fmul
        00000000 W __mulsf3

        _arm_addsubdf3.o:
        00000368 T __aeabi_f2d
        00000368 T __extendsfdf2

        _arm_fixsfsi.o:
        00000000 T __aeabi_f2iz
        00000000 T __fixsfsi

        mulsf3.o:
        fixsfsi.o:
        extendsfdf2.o:

    gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a:

        // assembly routines

        _arm_mulsf3.o:
        00000000 T __aeabi_fmul
                 U __fp_assemble
                 U __fp_exception
                 U __fp_infinity
                 U __fp_zero
        00000000 T __mulsf3
                 U __umulsidi3

        _arm_fixsfsi.o:
        00000000 T __aeabi_f2iz
        00000000 T __fixsfsi
        00000002 T __internal_f2iz

        _arm_f2d.o:
        00000000 T __aeabi_f2d
        00000000 T __extendsfdf2
                 U __fp_normalize2

        // soft-float library

        mulsf3.o:
        00000000 T __aeabi_fmul

        fixsfsi.o:
        00000000 T __aeabi_f2iz

        extendsfdf2.o:
        00000000 T __aeabi_f2d

Given the order of the archive file, I expect the linker to import the
affected functions from the _arm_* archive elements.

For "convert-sat.c", all is well with -march=armv7-m.

    ...
        (/home/mirdan/gcc-obj/gcc/libgcc.a)_arm_muldf3.o
    OK> (/home/mirdan/gcc-obj/gcc/libgcc.a)_arm_mulsf3.o
        (/home/mirdan/gcc-obj/gcc/libgcc.a)_arm_cmpsf2.o
        (/home/mirdan/gcc-obj/gcc/libgcc.a)_arm_fixsfsi.o
        (/home/mirdan/gcc-obj/gcc/libgcc.a)_arm_fixunssfsi.o
    OK> (/home/mirdan/gcc-obj/gcc/libgcc.a)_arm_addsubdf3.o
        (/home/mirdan/gcc-obj/gcc/libgcc.a)_arm_cmpdf2.o
        (/home/mirdan/gcc-obj/gcc/libgcc.a)_arm_fixdfsi.o
        (/home/mirdan/gcc-obj/gcc/libgcc.a)_arm_fixunsdfsi.o
    OK> (/home/mirdan/gcc-obj/gcc/libgcc.a)_fixsfdi.o
        (/home/mirdan/gcc-obj/gcc/libgcc.a)_fixdfdi.o
        (/home/mirdan/gcc-obj/gcc/libgcc.a)_fixunssfdi.o
        (/home/mirdan/gcc-obj/gcc/libgcc.a)_fixunsdfdi.o
    ...

However, with -march=armv6s-m, the linker imports these symbols from the
soft-float library. (NOTE: The CM0 library only implements single-precision
float operations, so imports from muldf3.o, fixdfsi.o, etc. are expected.)

    ...
    ??> (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)mulsf3.o
    ??> (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)fixsfsi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)muldf3.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)fixdfsi.o
    ??> (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)extendsfdf2.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_clzsi2.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_fcmpge.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_fcmple.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_fixsfdi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_fixunssfdi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_fixunssfsi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_cmpdf2.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_fixunsdfsi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_fixdfdi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_fixunsdfdi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)eqdf2.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)gedf2.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)ledf2.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)subdf3.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)floatunsidf.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_cmpsf2.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_fixsfsi.o
    ...

It seems that the order in which the linker resolves symbols matters.

In the affected test cases, the linker begins searching for fixed-point
function symbols first: _subQQ.o, _cmpQQ.o, etc. The fixed-point archive
elements appear after the _arm_* archive elements, so the initial
definitions of the floating point functions are discarded. However, the
fixed-point functions contain unresolved symbol references which the linker
registers progressively. Given that the default libgcc.a does not build the
soft-float library [1], the linker cannot import any floating point objects
until the second pass. However, when v6-m/nofp/libgcc.a _does_ include the
soft-float library, the linker proceeds to import some floating point
objects during the first pass.

To test this theory, add explicit symbol references to convert-sat.c:

    --- a/gcc/testsuite/gcc.dg/fixed-point/convert-sat.c
    +++ b/gcc/testsuite/gcc.dg/fixed-point/convert-sat.c
    @@ -11,6 +11,12 @@ extern void abort (void);
     int main ()
     {
    +  volatile float a = 1.0;
    +  volatile float b = 2.0;
    +  volatile float c = a * b;
    +  volatile double d = a;
    +  volatile int e = a;
    +
       SAT_CONV1 (short _Accum, hk);
       SAT_CONV1 (_Accum, k);
       SAT_CONV1 (long _Accum, lk);

Afterwards, the linker imports the expected symbols:

    ...
    ==> (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_mulsf3.o
    ==> (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_muldi3.o
    ==> (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_fixsfsi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_f2d.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_fp_exceptionf.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_fp_assemblef.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_fp_normalizef.o
    ...
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)muldf3.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)fixdfsi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_clzsi2.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_fixunssfsi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_fcmpge.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_fcmple.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_fixsfdi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_fixunssfdi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_cmpdf2.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_fixunsdfsi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_fixdfdi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_fixunsdfdi.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)eqdf2.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)gedf2.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)ledf2.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)subdf3.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)floatunsidf.o
        (/home/mirdan/gcc-obj/gcc/thumb/v6-m/nofp/libgcc.a)_arm_cmpsf2.o
    ...

At a minimum this behavior results in the use of non-preferred code in an
affected application. However, as long as each object exports a single entry
point, this does not automatically result in a build failure. Indeed, in the
case of __aeabi_fmul() and __aeabi_f2d(), all references seem to resolve
uniformly in favor of the soft-float library.
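
The effect of reference order on archive resolution can be sketched with a
toy model in C. This is only an illustration under stated assumptions: it
assumes a simplified linker that rescans a single archive in member order and
loads a member only when that member defines a symbol that is currently
wanted and not yet defined. That is not claimed to be the actual GNU ld
algorithm. The member _fixed_conv.o, the symbol __fixed_conv, the function
simulate_link(), and the relative position of the soft-float member are
hypothetical; the remaining names are taken from the listings above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS    16
#define MAX_MEMBERS 16
#define MAX_NAMES   4

/* One archive member: the symbols it defines and the ones it references.
   Unused slots are NULL. */
struct member {
    const char *name;
    const char *defs[MAX_NAMES];
    const char *refs[MAX_NAMES];
};

struct symset {
    const char *names[MAX_SYMS];
    size_t count;
};

static int has(const struct symset *s, const char *sym)
{
    for (size_t i = 0; i < s->count; i++)
        if (strcmp(s->names[i], sym) == 0)
            return 1;
    return 0;
}

static void add(struct symset *s, const char *sym)
{
    if (has(s, sym))
        return;
    assert(s->count < MAX_SYMS);
    s->names[s->count++] = sym;
}

/* Rescan the archive in member order until no new member is loaded.
   Returns the number of duplicate definitions encountered. */
int simulate_link(const struct member *ar, size_t n,
                  const char *const *roots, size_t nroots)
{
    struct symset defined = { { NULL }, 0 };
    struct symset wanted = { { NULL }, 0 };
    int loaded[MAX_MEMBERS] = { 0 };
    int dups = 0;
    int progress = 1;

    assert(n <= MAX_MEMBERS);
    for (size_t i = 0; i < nroots; i++)
        add(&wanted, roots[i]);

    while (progress) {
        progress = 0;
        for (size_t m = 0; m < n; m++) {
            int needed = 0;
            if (loaded[m])
                continue;
            /* Load only if the member supplies a wanted, undefined symbol. */
            for (size_t d = 0; d < MAX_NAMES && ar[m].defs[d]; d++)
                if (has(&wanted, ar[m].defs[d]) && !has(&defined, ar[m].defs[d]))
                    needed = 1;
            if (!needed)
                continue;
            loaded[m] = 1;
            progress = 1;
            /* Loading a member brings in ALL of its definitions. */
            for (size_t d = 0; d < MAX_NAMES && ar[m].defs[d]; d++) {
                if (has(&defined, ar[m].defs[d]))
                    dups++;
                else
                    add(&defined, ar[m].defs[d]);
            }
            for (size_t r = 0; r < MAX_NAMES && ar[m].refs[r]; r++)
                add(&wanted, ar[m].refs[r]);
        }
    }
    return dups;
}

/* Assumed order: assembly routines, then a hypothetical fixed-point
   member, then the soft-float member. */
static const struct member sample_archive[] = {
    { "_arm_fixsfsi.o",    { "__aeabi_f2iz", "__internal_f2iz" }, { NULL } },
    { "_arm_fixunssfsi.o", { "__aeabi_f2uiz" }, { "__internal_f2iz" } },
    { "_fixed_conv.o",     { "__fixed_conv" }, { "__aeabi_f2iz", "__aeabi_f2uiz" } },
    { "fixsfsi.o",         { "__aeabi_f2iz" }, { NULL } },
};
```

In this model, starting from only the hypothetical fixed-point symbol,
simulate_link() reports one duplicate definition: __aeabi_f2iz is first taken
from fixsfsi.o and later arrives again with _arm_fixsfsi.o. Starting with
__aeabi_f2iz already referenced, as the modified convert-sat.c does, it
reports none.
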
The first pass that imports the soft-float version of __aeabi_f2iz() also
succeeds. However, the first pass fails to find __aeabi_f2uiz(), since the
soft-float library does not implement this variant. So, this symbol remains
undefined until the second pass. However, the assembly version of
__aeabi_f2uiz() the linker finds happens to be implemented as a branch to
__internal_f2iz() [2]. But the linker, importing __internal_f2iz(), also
finds the main entry point __aeabi_f2iz(). And, since __aeabi_f2iz() was
already found in the soft-float library, the linker throws an error.

The solution is two-fold. First, the assembly routines have separately been
made robust against this potential error condition (by weakening and
splitting symbols). Second, this commit blocks single-precision functions
from the soft-float library, which makes it impossible for the linker to
select a non-preferred version. Two duplicate symbols remain, (extendsfdf2)
and (truncdfsf2), but the situation is much improved.

[1] softfp_wrap_start = "#if !__ARM_ARCH_ISA_ARM && __ARM_ARCH_ISA_THUMB == 1"
[2] (These operations share a substantial portion of their code path, so
    this choice leads to a size reduction in programs that use both
    functions.)

gcc/libgcc/ChangeLog:
2021-01-13 Daniel Engel

	* config/arm/t-softfp (softfp_float_modes): Added as "df".
---
 libgcc/config/arm/t-softfp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libgcc/config/arm/t-softfp b/libgcc/config/arm/t-softfp
index 554ec9bc47b..bd6a4642e5f 100644
--- a/libgcc/config/arm/t-softfp
+++ b/libgcc/config/arm/t-softfp
@@ -1,2 +1,4 @@
 softfp_wrap_start := '\#if !__ARM_ARCH_ISA_ARM && __ARM_ARCH_ISA_THUMB == 1'
 softfp_wrap_end := '\#endif'
+softfp_float_modes := df
+