Message ID | cover.1610363037.git.gnu@danielengel.com |
---|---|
Headers | From: gnu@danielengel.com; To: gcc-patches@gcc.gnu.org; Cc: Daniel Engel <gnu@danielengel.com>, Richard.Earnshaw@foss.arm.com; Subject: [PATCH v4 00/29] libgcc: Thumb-1 Floating-Point Library for Cortex M0; Date: Mon, 11 Jan 2021 03:10:39 -0800 |
Series | libgcc: Thumb-1 Floating-Point Library for Cortex M0 |
From: Daniel Engel <gnu@danielengel.com>

This patch revision is based on comments received against:
<https://gcc.gnu.org/pipermail/gcc-patches/2021-January/562917.html>

As one point of comparison, a test program [1] links 916 bytes from libgcc with the patched toolchain vs 10276 bytes with the gcc-arm-none-eabi-9-2020-q2 toolchain. That's a size reduction of about 91%.

I have extensive test vectors [2], and this patch passes all tests on an STM32F051. These vectors were derived from UCB [3], TestFloat [4], and IEEECC754 [5], plus many of my own generation.

There may be some follow-on projects worth discussing:

* The library is currently integrated into the ARM v6m multilib only. It is likely that some other architectures would benefit from these routines. However, I have NOT profiled the existing implementations (ieee754-sf.S) to estimate where improvements may be found.

* GCC currently lacks tests for some functions, such as __aeabi_[u]ldivmod(). There may be useful bits in [1] that can be integrated.
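As a rough illustration of the kind of test the second bullet has in mind, the sketch below checks the division identity n == q*d + r for 64-bit operands. On an ARM EABI target, GCC typically lowers the `/` and `%` operators on 64-bit integers to helper calls such as __aeabi_uldivmod() and __aeabi_ldivmod(); on other hosts the same code simply exercises the native division. All function names here are hypothetical and are not part of the patch or of the GCC testsuite.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: absolute value as unsigned, safe for INT64_MIN. */
uint64_t magnitude(int64_t x)
{
    return x < 0 ? (uint64_t)0 - (uint64_t)x : (uint64_t)x;
}

/* Returns 1 if unsigned 64-bit division and modulo agree with each other. */
int check_udivmod(uint64_t n, uint64_t d)
{
    assert(d != 0);
    uint64_t q = n / d;
    uint64_t r = n % d;
    return r < d && q * d + r == n;
}

/* Returns 1 if signed 64-bit division and modulo agree with each other. */
int check_divmod(int64_t n, int64_t d)
{
    assert(d != 0);
    assert(!(n == INT64_MIN && d == -1));   /* overflow is undefined in C */
    int64_t q = n / d;
    int64_t r = n % d;
    /* C99 truncates toward zero, so a nonzero remainder has the dividend's sign. */
    if (r != 0 && ((r < 0) != (n < 0)))
        return 0;
    if (magnitude(r) >= magnitude(d))
        return 0;
    /* Recombine in unsigned arithmetic to avoid signed overflow. */
    return (uint64_t)q * (uint64_t)d + (uint64_t)r == (uint64_t)n;
}

/* Runs every pair of sample operands; returns the number of failures. */
unsigned divmod_selftest(void)
{
    static const uint64_t u[] = {
        0, 1, 2, 3, 10, 0xFFFFFFFFull, 0x100000000ull,
        0x0123456789ABCDEFull, UINT64_MAX
    };
    static const int64_t s[] = {
        0, 1, -1, 7, -7, INT32_MAX, INT32_MIN, INT64_MAX, INT64_MIN
    };
    const size_t nu = sizeof u / sizeof u[0];
    const size_t ns = sizeof s / sizeof s[0];
    unsigned failures = 0;

    for (size_t i = 0; i < nu; i++)
        for (size_t j = 0; j < nu; j++)
            if (u[j] != 0 && !check_udivmod(u[i], u[j]))
                failures++;

    for (size_t i = 0; i < ns; i++)
        for (size_t j = 0; j < ns; j++) {
            if (s[j] == 0 || (s[i] == INT64_MIN && s[j] == -1))
                continue;
            if (!check_divmod(s[i], s[j]))
                failures++;
        }

    return failures;
}
```

Calling divmod_selftest() from a test driver and expecting 0 would be one way to use it. The identity check catches inconsistent quotient/remainder pairs without needing a table of precomputed answers, though it cannot replace reference vectors such as [2].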
On Cortex M0, the functions have (approximately) the following properties:

    Function(s)                     Size (bytes)        Cycles            Stack  Accuracy
    __clzsi2                        50                  20                0      exact
    __clzsi2 (OPTIMIZE_SIZE)        22                  51                0      exact
    __clzdi2                        8+__clzsi2          4+__clzsi2        0      exact

    __clrsbsi2                      8+__clzsi2          6+__clzsi2        0      exact
    __clrsbdi2                      18+__clzsi2         (8..10)+__clzsi2  0      exact

    __ctzsi2                        52                  21                0      exact
    __ctzsi2 (OPTIMIZE_SIZE)        24                  52                0      exact
    __ctzdi2                        8+__ctzsi2          5+__ctzsi2        0      exact

    __ffssi2                        8                   6..(5+__ctzsi2)   0      exact
    __ffsdi2                        14+__ctzsi2         9..(8+__ctzsi2)   0      exact

    __popcountsi2                   52                  25                0      exact
    __popcountsi2 (OPTIMIZE_SIZE)   14                  9..201            0      exact
    __popcountdi2                   34+__popcountsi2    46                0      exact
    __popcountdi2 (OPTIMIZE_SIZE)   12+__popcountsi2    17..401           0      exact

    __paritysi2                     24                  14                0      exact
    __paritysi2 (OPTIMIZE_SIZE)     16                  38                0      exact
    __paritydi2                     2+__paritysi2       1+__paritysi2     0      exact

    __umulsidi3                     44                  24                0      exact
    __mulsidi3                      30+__umulsidi3      24+__umulsidi3    8      exact
    __muldi3 (__aeabi_lmul)         10+__umulsidi3      6+__umulsidi3     0      exact
    __ashldi3 (__aeabi_llsl)        22                  13                0      exact
    __lshrdi3 (__aeabi_llsr)        22                  13                0      exact
    __ashrdi3 (__aeabi_lasr)        22                  13                0      exact

    __aeabi_lcmp                    20                  13                0      exact
    __aeabi_ulcmp                   16                  10                0      exact

    __udivsi3 (__aeabi_uidiv)       56                  72..385           0      < 1 lsb
    __divsi3 (__aeabi_idiv)         38+__udivsi3        26+__udivsi3      8      < 1 lsb
    __udivdi3 (__aeabi_uldiv)       164                 103..1394         16     < 1 lsb
    __udivdi3 (OPTIMIZE_SIZE)       142                 120..1392         16     < 1 lsb
    __divdi3 (__aeabi_ldiv)         54+__udivdi3        36+__udivdi3      32     < 1 lsb

    __shared_float                  178
    __shared_float (OPTIMIZE_SIZE)  154

    __addsf3 (__aeabi_fadd)         116+__shared_float  31..76            8      <= 0.5 ulp
    __addsf3 (OPTIMIZE_SIZE)        112+__shared_float  74                8      <= 0.5 ulp
    __subsf3 (__aeabi_fsub)         6+__addsf3          3+__addsf3        8      <= 0.5 ulp
    __aeabi_frsub                   8+__addsf3          6+__addsf3        8      <= 0.5 ulp
    __mulsf3 (__aeabi_fmul)         112+__shared_float  73..97            8      <= 0.5 ulp
    __mulsf3 (OPTIMIZE_SIZE)        96+__shared_float   93                8      <= 0.5 ulp
    __divsf3 (__aeabi_fdiv)         132+__shared_float  83..361           8      <= 0.5 ulp
    __divsf3 (OPTIMIZE_SIZE)        120+__shared_float  263..359          8      <= 0.5 ulp

    __cmpsf2/__lesf2/__ltsf2        72                  33                0      exact
    __eqsf2/__nesf2                 4+__cmpsf2          3+__cmpsf2        0      exact
    __gesf2/__gtsf2                 4+__cmpsf2          3+__cmpsf2        0      exact
    __unordsf2 (__aeabi_fcmpun)     4+__cmpsf2          3+__cmpsf2        0      exact
    __aeabi_fcmpeq                  4+__cmpsf2          3+__cmpsf2        0      exact
    __aeabi_fcmpne                  4+__cmpsf2          3+__cmpsf2        0      exact
    __aeabi_fcmplt                  4+__cmpsf2          3+__cmpsf2        0      exact
    __aeabi_fcmple                  4+__cmpsf2          3+__cmpsf2        0      exact
    __aeabi_fcmpge                  4+__cmpsf2          3+__cmpsf2        0      exact

    __floatundisf (__aeabi_ul2f)    14+__shared_float   40..81            8      <= 0.5 ulp
    __floatundisf (OPTIMIZE_SIZE)   14+__shared_float   40..237           8      <= 0.5 ulp
    __floatunsisf (__aeabi_ui2f)    0+__floatundisf     1+__floatundisf   8      <= 0.5 ulp
    __floatdisf (__aeabi_l2f)       14+__floatundisf    7+__floatundisf   8      <= 0.5 ulp
    __floatsisf (__aeabi_i2f)       0+__floatdisf       1+__floatdisf     8      <= 0.5 ulp

    __fixsfdi (__aeabi_f2lz)        74                  27..33            0      exact
    __fixunssfdi (__aeabi_f2ulz)    4+__fixsfdi         3+__fixsfdi       0      exact
    __fixsfsi (__aeabi_f2iz)        52                  19                0      exact
    __fixsfsi (OPTIMIZE_SIZE)       4+__fixsfdi         3+__fixsfdi       0      exact
    __fixunssfsi (__aeabi_f2uiz)    4+__fixsfsi         3+__fixsfsi       0      exact

    __extendsfdf2 (__aeabi_f2d)     42+__shared_float   38                8      exact
    __truncsfdf2 (__aeabi_f2d)      88                  34                8      exact
    __aeabi_d2f                     56+__shared_float   54..58            8      <= 0.5 ulp
    __aeabi_h2f                     34+__shared_float   34                8      exact
    __aeabi_f2h                     84                  23..34            0      <= 0.5 ulp

Copyright assignment is on file with the FSF.

Thanks,
Daniel Engel

[1] // Test program for size comparison

    extern int main (void)
    {
        volatile int x = 1;
        volatile unsigned long long int y = 10;
        volatile long long int z = x / y; // 64-bit division

        volatile float a = x; // 32-bit casting
        volatile float b = y; // 64 bit casting
        volatile float c = z / b; // float division
        volatile float d = a + c; // float addition
        volatile float e = c * b; // float multiplication
        volatile float f = d - e - c; // float subtraction

        if (f != c) // float comparison
            y -= (long long int)d; // float casting
    }

[2] http://danielengel.com/cm0_test_vectors.tgz
[3] http://www.netlib.org/fp/ucbtest.tgz
[4] http://www.jhauser.us/arithmetic/TestFloat.html
[5] http://win-www.uia.ac.be/u/cant/ieeecc754.html
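The integer rows marked "exact" in the table can be cross-checked against slow but obvious reference code. The sketch below is one hedged way to do that: it compares naive loop implementations with the GCC/Clang builtins __builtin_clz(), __builtin_ctz(), __builtin_popcount(), __builtin_parity(), and __builtin_clzll(), which a Cortex M0 toolchain is expected to route to the corresponding libgcc routines. The ref_* names and the pseudo-random input sequence are illustrative assumptions, not taken from the patch or from the test vectors in [2].

```c
#include <assert.h>
#include <stdint.h>

/* Naive reference: leading zero bits of a nonzero 32-bit value. */
unsigned ref_clz32(uint32_t x)
{
    assert(x != 0);
    unsigned n = 0;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Naive reference: trailing zero bits of a nonzero 32-bit value. */
unsigned ref_ctz32(uint32_t x)
{
    assert(x != 0);
    unsigned n = 0;
    while (!(x & 1u)) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Naive reference: number of set bits. */
unsigned ref_popcount32(uint32_t x)
{
    unsigned n = 0;
    for (; x != 0; x &= x - 1)   /* each step clears the lowest set bit */
        n++;
    return n;
}

/* 64-bit count built from the 32-bit one, mirroring how the table
   expresses __clzdi2 in terms of __clzsi2 (an assumed decomposition). */
unsigned ref_clz64(uint64_t x)
{
    uint32_t hi = (uint32_t)(x >> 32);
    return hi ? ref_clz32(hi) : 32u + ref_clz32((uint32_t)x);
}

/* Compares references with the compiler builtins; returns failure count. */
unsigned bitcount_selftest(void)
{
    unsigned failures = 0;
    uint32_t x = 1;

    for (int i = 0; i < 1000; i++) {
        uint64_t wide = ((uint64_t)x << 17) | 1u;   /* always nonzero */

        failures += ref_clz32(x) != (unsigned)__builtin_clz(x);
        failures += ref_ctz32(x) != (unsigned)__builtin_ctz(x);
        failures += ref_popcount32(x) != (unsigned)__builtin_popcount(x);
        failures += (ref_popcount32(x) & 1u) != (unsigned)__builtin_parity(x);
        failures += ref_clz64(wide) != (unsigned)__builtin_clzll(wide);

        x = x * 1664525u + 1013904223u;   /* simple LCG for varied inputs */
        if (x == 0)
            x = 1;                        /* clz/ctz builtins are undefined for 0 */
    }
    return failures;
}
```

A return value of 0 from bitcount_selftest() indicates agreement on the sampled inputs. Zero is deliberately excluded because the clz and ctz builtins leave that case undefined.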