From patchwork Wed May 8 10:10:51 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Georg-Johann Lay
X-Patchwork-Id: 1932986
Message-ID: <4794f5a4-9199-4c63-b845-60e5eb2ce207@gjlay.de>
Date: Wed, 8 May 2024 12:10:51 +0200
User-Agent: Mozilla Thunderbird
From: Georg-Johann Lay
To: "gcc-patches@gcc.gnu.org", Jeff Law
Subject: [patch,avr] PR114981: Implement __builtin_powif in assembly
List-Id: Gcc-patches mailing list

__builtin_powif is currently implemented in C, and this patch
implements it (__powisf2) in assembly.

Ok for master?

Johann

---

AVR: target/114981 - Tweak __powisf2

Implement __powisf2 in assembly.

	PR target/114981
libgcc/
	* config/avr/t-avr (LIB2FUNCS_EXCLUDE): Add _powisf2.
	(LIB1ASMFUNCS) [!avrtiny]: Add _powif.
	* config/avr/lib1funcs.S (mov4): New .macro.
	(L_powif, __powisf2) [!avrtiny]: New module and function.
testsuite/
	* gcc.target/avr/pr114981-powif.c: New test.

diff --git a/gcc/testsuite/gcc.target/avr/pr114981-powif.c b/gcc/testsuite/gcc.target/avr/pr114981-powif.c
new file mode 100644
index 00000000000..191dcc61e6d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/avr/pr114981-powif.c
@@ -0,0 +1,33 @@
+/* { dg-do run { target { ! avr_tiny } } } */
+/* { dg-additional-options "-Os" } */
+
+const float vals[] =
+  {
+    0.0625f, -0.125f, 0.25f, -0.5f,
+    1.0f,
+    -2.0f, 4.0f, -8.0f, 16.0f
+  };
+
+#define ARRAY_SIZE(X) ((int) (sizeof(X) / sizeof(*X)))
+
+__attribute__((noinline,noclone))
+void test1 (float x)
+{
+  int i;
+
+  for (i = 0; i < ARRAY_SIZE (vals); ++i)
+    {
+      float val0 = vals[i];
+      float val1 = __builtin_powif (x, i - 4);
+      __asm ("" : "+r" (val0));
+
+      if (val0 != val1)
+        __builtin_exit (__LINE__);
+    }
+}
+
+int main (void)
+{
+  test1 (-2.0f);
+  return 0;
+}
diff --git a/libgcc/config/avr/lib1funcs.S b/libgcc/config/avr/lib1funcs.S
index 4ac31fa104e..04a4eb01ab4 100644
--- a/libgcc/config/avr/lib1funcs.S
+++ b/libgcc/config/avr/lib1funcs.S
@@ -80,6 +80,11 @@
 #endif
 .endm

+.macro mov4 r_dest, r_src
+    wmov \r_dest, \r_src
+    wmov \r_dest+2, \r_src+2
+.endm
+
 #if defined (__AVR_HAVE_JMP_CALL__)
 #define XCALL call
 #define XJMP jmp
@@ -3312,4 +3317,153 @@ DEFUN __fmul
 #undef C0
 #undef C1

+
+
+/**********************************
+ * Floating-Point
+ **********************************/
+
+#if defined (L_powif)
+#ifndef __AVR_TINY__
+
+;; float output and arg #1
+#define A0 22
+#define A1 A0 + 1
+#define A2 A0 + 2
+#define A3 A0 + 3
+
+;; float arg #2
+#define B0 18
+#define B1 B0 + 1
+#define B2 B0 + 2
+#define B3 B0 + 3
+
+;; float X: input and iterated squares
+#define X0 10
+#define X1 X0 + 1
+#define X2 X0 + 2
+#define X3 X0 + 3
+
+;; float Y: expand result
+#define Y0 14
+#define Y1 Y0 + 1
+#define Y2 Y0 + 2
+#define Y3 Y0 + 3
+
+;; .7 = Sign of I.
+;; .0 == 0 => Y = 1.0f implicitly.
+#define Flags R9
+#define Y_set 0
+
+;;; Integer exponent input.
+#define I0 28
+#define I1 I0+1
+
+#define ONE 0x3f800000
+
+DEFUN __powisf2
+    ;; Save 11 Registers: R9...R17, R28, R29
+    do_prologue_saves 11
+
+    ;; Fill local vars with input parameters.
+    wmov I0, 20
+    mov4 X0, A0
+    ;; Save sign of exponent for later.
+    mov Flags, I1
+    ;; I := abs (I)
+    tst I1
+    brpl 1f
+    NEG2 I0
+1:
+    ;; Y := (I % 2) ? X : 1.0f
+    ;; (When we come from below, this is like SET, i.e. Flags.Y_set := 1).
+    bst I0, 0
+    ;; Flags.Y_set = false means that we have to assume Y = 1.0f below.
+    bld Flags, Y_set
+2:  ;; We have A == X when we come from above.
+    mov4 Y0, A0
+
+.Loop:
+    ;; while (I >>= 1)
+    lsr I1
+    ror I0
+    sbiw I0, 0
+    breq .Loop_done
+
+    ;; X := X * X
+    mov4 A0, X0
+#ifdef __WITH_AVRLIBC__
+    XCALL squaref
+#else
+    mov4 B0, X0
+    XCALL __mulsf3
+#endif /* Have AVR-LibC? */
+    mov4 X0, A0
+
+    ;; if (I % 2 == 1) Y := Y * X
+    bst I0, 0
+    brtc .Loop
+    bst Flags, Y_set
+    ;; When Y is not set => Y := Y * X = 1.0f * X (= A)
+    ;; Plus, we have to set Y_set = 1 (= I0.0)
+    brtc 1b
+    ;; Y is already set: Y := X * Y (= A * Y)
+    mov4 B0, Y0
+    XCALL __mulsf3
+    rjmp 2b
+
+    ;; End while
+.Loop_done:
+
+    ;; A := 1.0f
+    ldi A3, hhi8(ONE)
+    ldi A2, hlo8(ONE)
+    ldi A1, hi8(ONE)
+    ldi A0, lo8(ONE)
+
+    ;; When Y is still not set, the result is 1.0f (= A).
+    bst Flags, Y_set
+    brtc .Lret
+
+    ;; if (I was < 0) Y = 1.0f / Y
+    tst Flags
+    brmi 1f
+    ;; A := Y
+    mov4 A0, Y0
+    rjmp .Lret
+1:  ;; A := 1 / Y = A / Y
+    mov4 B0, Y0
+    XCALL __divsf3
+
+.Lret:
+    do_epilogue_restores 11
+ENDF __powisf2
+
+#undef A0
+#undef A1
+#undef A2
+#undef A3
+
+#undef B0
+#undef B1
+#undef B2
+#undef B3
+
+#undef X0
+#undef X1
+#undef X2
+#undef X3
+
+#undef Y0
+#undef Y1
+#undef Y2
+#undef Y3
+
+#undef I0
+#undef I1
+#undef ONE
+
+#endif /* __AVR_TINY__ */
+#endif /* L_powif */
+
 #include "lib1funcs-fixed.S"
diff --git a/libgcc/config/avr/t-avr b/libgcc/config/avr/t-avr
index ed84b3f342e..971a092aceb 100644
--- a/libgcc/config/avr/t-avr
+++ b/libgcc/config/avr/t-avr
@@ -68,7 +68,8 @@ LIB1ASMFUNCS += \
 	_bswapdi2 \
 	_ashldi3 _ashrdi3 _lshrdi3 _rotldi3 \
 	_adddi3 _adddi3_s8 _subdi3 \
-	_cmpdi2 _cmpdi2_s8
+	_cmpdi2 _cmpdi2_s8 \
+	_powif
 endif

 # Fixed point routines in avr/lib1funcs-fixed.S
@@ -110,6 +111,7 @@ LIB2FUNCS_EXCLUDE = \
 	_moddi3 _umoddi3 \
 	_clz \
 	_clrsbdi2 \
+	_powisf2

 ifeq ($(long_double_type_size),32)
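
For readers following the control flow of __powisf2, here is a rough C model of the same square-and-multiply scheme. It is only a sketch and not part of the patch: the name powif_model is hypothetical, and it uses a host-sized unsigned for the exponent where the assembly works on the 16-bit I1:I0 pair. The explicit Y_set flag from the assembly is folded into starting with y = 1.0f, which yields the same values.

```c
/* Hypothetical reference model, NOT part of the patch.
   Mirrors the comments in the assembly:
     I := abs (I)
     Y := (I % 2) ? X : 1.0f
     while (I >>= 1) { X := X * X; if (I % 2 == 1) Y := Y * X; }
     if (I was < 0) Y := 1.0f / Y  */
float powif_model (float x, int i)
{
  int negative = i < 0;
  /* Negate in unsigned arithmetic so the most negative int is well defined.  */
  unsigned n = negative ? 0u - (unsigned) i : (unsigned) i;
  float y = 1.0f;

  if (n & 1)
    y = x;

  while (n >>= 1)
    {
      x = x * x;        /* X holds the iterated squares.  */
      if (n & 1)
        y = y * x;      /* Multiply in the squares selected by set bits.  */
    }

  return negative ? 1.0f / y : y;
}
```

With x = -2.0f and exponents -4 ... 4, every intermediate is a power of two, so the model returns exactly the values listed in vals[] of the new test.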