From patchwork Sat Jun 9 17:34:56 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 163945
Date: Sat, 9 Jun 2012 19:34:56 +0200
Subject: Re: [PATCH, libgcc]: Put soft-FP exception handler out-of-line for x86
From: Uros Bizjak
To: gcc-patches@gcc.gnu.org

On Thu, Jun 7, 2012 at 9:58 PM, Uros Bizjak wrote:

> Attached patch rewrites x86 soft-FP as an out-of-line function that
> gets conditionally called
> when an exception occurs. This rewrite removes 17 instances of the
> same code from libgcc. In addition, the patch unifies a lot of code
> between the 32bit and 64bit x86 targets, and also introduces SSE
> instructions for 32bit targets. There are no other functional
> changes, although the change to real SSE FP ops is tempting. ;)
>
> 2012-06-07  Uros Bizjak
>
>        * config/i386/32/sfp-machine.h (__gcc_CMPtype, CMPtype,
>        _FP_KEEPNANFRACP, _FP_CHOOSENAN, FP_EX_INVALID, FP_EX_DENORM,
>        FP_EX_DIVZERO, FP_EX_OVERFLOW, FP_EX_UNDERFLOW, FP_EX_INEXACT,
>        FP_HANDLE_EXCEPTIONS, FP_RND_NEAREST, FP_RND_ZERO, FP_RND_PINF,
>        FP_RND_MINF, _FP_DECL_EX, FP_INIT_ROUNDMODE, FP_ROUNDMODE,
>        __LITTLE_ENDIAN, __BIG_ENDIAN, strong_alias): Move ...
>        * config/i386/64/sfp-machine.h: ... (delete here) ...
>        * config/i386/sfp-machine.h: ... to here.
>        (FP_EX_MASK): New.
>        (__sfp_handle_exceptions): New function declaration.
>        (FP_HANDLE_EXCEPTIONS): Use __sfp_handle_exceptions.
>        * config/i386/sfp-exceptions.c: New.
>        * config/i386/t-softfp: New.
>        * config.host (i[34567]86-*-* and x86_64-*-* soft-fp targets): Add
>        i386/t-softfp to tmake_file.
>
> Patch was tested on x86_64-pc-linux-gnu {,-m32}.

I have committed the attached, slightly changed patch (asm patterns in
__sfp_handle_exceptions) to the mainline SVN.

Uros.

Index: config/i386/sfp-machine.h
===================================================================
--- config/i386/sfp-machine.h (revision 188333)
+++ config/i386/sfp-machine.h (working copy)
@@ -3,8 +3,83 @@
 #define _FP_STRUCT_LAYOUT __attribute__ ((gcc_struct))
 #endif
 
+/* The type of the result of a floating point comparison.  This must
+   match `__libgcc_cmp_return__' in GCC for the target.  */
+typedef int __gcc_CMPtype __attribute__ ((mode (__libgcc_cmp_return__)));
+#define CMPtype __gcc_CMPtype
+
 #ifdef __x86_64__
 #include "config/i386/64/sfp-machine.h"
 #else
 #include "config/i386/32/sfp-machine.h"
 #endif
+
+#define _FP_KEEPNANFRACP 1
+
+/* Here is something Intel misdesigned: the specs don't define
+   the case where we have two NaNs with same mantissas, but
+   different sign.  Different operations pick up different NaNs.  */
+#define _FP_CHOOSENAN(fs, wc, R, X, Y, OP) \
+  do { \
+    if (_FP_FRAC_GT_##wc(X, Y) \
+        || (_FP_FRAC_EQ_##wc(X,Y) && (OP == '+' || OP == '*'))) \
+      { \
+        R##_s = X##_s; \
+        _FP_FRAC_COPY_##wc(R,X); \
+      } \
+    else \
+      { \
+        R##_s = Y##_s; \
+        _FP_FRAC_COPY_##wc(R,Y); \
+      } \
+    R##_c = FP_CLS_NAN; \
+  } while (0)
+
+#define FP_EX_INVALID 0x01
+#define FP_EX_DENORM 0x02
+#define FP_EX_DIVZERO 0x04
+#define FP_EX_OVERFLOW 0x08
+#define FP_EX_UNDERFLOW 0x10
+#define FP_EX_INEXACT 0x20
+
+#define FP_EX_MASK 0x3f
+
+void __sfp_handle_exceptions (int);
+
+#define FP_HANDLE_EXCEPTIONS \
+  do { \
+    if (_fex & FP_EX_MASK) \
+      __sfp_handle_exceptions (_fex); \
+  } while (0)
+
+#define FP_RND_NEAREST 0
+#define FP_RND_ZERO 0xc00
+#define FP_RND_PINF 0x800
+#define FP_RND_MINF 0x400
+
+#define _FP_DECL_EX \
+  unsigned short _fcw __attribute__ ((unused)) = FP_RND_NEAREST
+
+#define FP_INIT_ROUNDMODE \
+  do { \
+    __asm__ ("fnstcw %0" : "=m" (_fcw)); \
+  } while (0)
+
+#define FP_ROUNDMODE (_fcw & 0xc00)
+
+#define __LITTLE_ENDIAN 1234
+#define __BIG_ENDIAN 4321
+
+#define __BYTE_ORDER __LITTLE_ENDIAN
+
+/* Define ALIASNAME as a strong alias for NAME.  */
+#if defined __MACH__
+/* Mach-O doesn't support aliasing.  If these functions ever return
+   anything but CMPtype we need to revisit this... */
+#define strong_alias(name, aliasname) \
+  CMPtype aliasname (TFtype a, TFtype b) { return name(a, b); }
+#else
+# define strong_alias(name, aliasname) _strong_alias(name, aliasname)
+# define _strong_alias(name, aliasname) \
+  extern __typeof (name) aliasname __attribute__ ((alias (#name)));
+#endif
Index: config/i386/64/sfp-machine.h
===================================================================
--- config/i386/64/sfp-machine.h (revision 188333)
+++ config/i386/64/sfp-machine.h (working copy)
@@ -1,5 +1,4 @@
 #define _FP_W_TYPE_SIZE 64
-
 #define _FP_W_TYPE unsigned long long
 #define _FP_WS_TYPE signed long long
 #define _FP_I_TYPE long long
@@ -9,11 +8,6 @@
 
 #define TI_BITS (__CHAR_BIT__ * (int)sizeof(TItype))
 
-/* The type of the result of a floating point comparison.  This must
-   match `__libgcc_cmp_return__' in GCC for the target.  */
-typedef int __gcc_CMPtype __attribute__ ((mode (__libgcc_cmp_return__)));
-#define CMPtype __gcc_CMPtype
-
 #define _FP_MUL_MEAT_Q(R,X,Y) \
   _FP_MUL_MEAT_2_wide(_FP_WFRACBITS_Q,R,X,Y,umul_ppmm)
 
@@ -27,126 +21,3 @@
 #define _FP_NANSIGN_D 1
 #define _FP_NANSIGN_E 1
 #define _FP_NANSIGN_Q 1
-
-#define _FP_KEEPNANFRACP 1
-
-/* Here is something Intel misdesigned: the specs don't define
-   the case where we have two NaNs with same mantissas, but
-   different sign.  Different operations pick up different NaNs.  */
-#define _FP_CHOOSENAN(fs, wc, R, X, Y, OP) \
-  do { \
-    if (_FP_FRAC_GT_##wc(X, Y) \
-        || (_FP_FRAC_EQ_##wc(X,Y) && (OP == '+' || OP == '*'))) \
-      { \
-        R##_s = X##_s; \
-        _FP_FRAC_COPY_##wc(R,X); \
-      } \
-    else \
-      { \
-        R##_s = Y##_s; \
-        _FP_FRAC_COPY_##wc(R,Y); \
-      } \
-    R##_c = FP_CLS_NAN; \
-  } while (0)
-
-#define FP_EX_INVALID 0x01
-#define FP_EX_DENORM 0x02
-#define FP_EX_DIVZERO 0x04
-#define FP_EX_OVERFLOW 0x08
-#define FP_EX_UNDERFLOW 0x10
-#define FP_EX_INEXACT 0x20
-
-struct fenv
-{
-  unsigned short int __control_word;
-  unsigned short int __unused1;
-  unsigned short int __status_word;
-  unsigned short int __unused2;
-  unsigned short int __tags;
-  unsigned short int __unused3;
-  unsigned int __eip;
-  unsigned short int __cs_selector;
-  unsigned int __opcode:11;
-  unsigned int __unused4:5;
-  unsigned int __data_offset;
-  unsigned short int __data_selector;
-  unsigned short int __unused5;
-};
-
-#ifdef __AVX__
-  #define ASM_INVALID "vdivss %0, %0, %0"
-  #define ASM_DIVZERO "vdivss %1, %0, %0"
-#else
-  #define ASM_INVALID "divss %0, %0"
-  #define ASM_DIVZERO "divss %1, %0"
-#endif
-
-#define FP_HANDLE_EXCEPTIONS \
-  do { \
-    if (_fex & FP_EX_INVALID) \
-      { \
-        float f = 0.0; \
-        __asm__ __volatile__ (ASM_INVALID : : "x" (f)); \
-      } \
-    if (_fex & FP_EX_DIVZERO) \
-      { \
-        float f = 1.0, g = 0.0; \
-        __asm__ __volatile__ (ASM_DIVZERO : : "x" (f), "x" (g)); \
-      } \
-    if (_fex & FP_EX_OVERFLOW) \
-      { \
-        struct fenv temp; \
-        __asm__ __volatile__ ("fnstenv %0" : "=m" (temp)); \
-        temp.__status_word |= FP_EX_OVERFLOW; \
-        __asm__ __volatile__ ("fldenv %0" : : "m" (temp)); \
-        __asm__ __volatile__ ("fwait"); \
-      } \
-    if (_fex & FP_EX_UNDERFLOW) \
-      { \
-        struct fenv temp; \
-        __asm__ __volatile__ ("fnstenv %0" : "=m" (temp)); \
-        temp.__status_word |= FP_EX_UNDERFLOW; \
-        __asm__ __volatile__ ("fldenv %0" : : "m" (temp)); \
-        __asm__ __volatile__ ("fwait"); \
-      } \
-    if (_fex & FP_EX_INEXACT) \
-      { \
-        struct fenv temp; \
-        __asm__ __volatile__ ("fnstenv %0" : "=m" (temp)); \
-        temp.__status_word |= FP_EX_INEXACT; \
-        __asm__ __volatile__ ("fldenv %0" : : "m" (temp)); \
-        __asm__ __volatile__ ("fwait"); \
-      } \
-  } while (0)
-
-#define FP_RND_NEAREST 0
-#define FP_RND_ZERO 0xc00
-#define FP_RND_PINF 0x800
-#define FP_RND_MINF 0x400
-
-#define _FP_DECL_EX \
-  unsigned short _fcw __attribute__ ((unused)) = FP_RND_NEAREST
-
-#define FP_INIT_ROUNDMODE \
-  do { \
-    __asm__ ("fnstcw %0" : "=m" (_fcw)); \
-  } while (0)
-
-#define FP_ROUNDMODE (_fcw & 0xc00)
-
-#define __LITTLE_ENDIAN 1234
-#define __BIG_ENDIAN 4321
-
-#define __BYTE_ORDER __LITTLE_ENDIAN
-
-/* Define ALIASNAME as a strong alias for NAME.  */
-#if defined __MACH__
-/* Mach-O doesn't support aliasing.  If these functions ever return
-   anything but CMPtype we need to revisit this... */
-#define strong_alias(name, aliasname) \
-  CMPtype aliasname (TFtype a, TFtype b) { return name(a, b); }
-#else
-# define strong_alias(name, aliasname) _strong_alias(name, aliasname)
-# define _strong_alias(name, aliasname) \
-  extern __typeof (name) aliasname __attribute__ ((alias (#name)));
-#endif
Index: config/i386/32/sfp-machine.h
===================================================================
--- config/i386/32/sfp-machine.h (revision 188333)
+++ config/i386/32/sfp-machine.h (working copy)
@@ -3,11 +3,6 @@
 #define _FP_WS_TYPE signed int
 #define _FP_I_TYPE int
 
-/* The type of the result of a floating point comparison.  This must
-   match `__libgcc_cmp_return__' in GCC for the target.  */
-typedef int __gcc_CMPtype __attribute__ ((mode (__libgcc_cmp_return__)));
-#define CMPtype __gcc_CMPtype
-
 #define __FP_FRAC_ADD_4(r3,r2,r1,r0,x3,x2,x1,x0,y3,y2,y1,y0) \
   __asm__ ("add{l} {%11,%3|%3,%11}\n\t" \
            "adc{l} {%9,%2|%2,%9}\n\t" \
@@ -85,122 +80,3 @@
 #define _FP_NANSIGN_D 1
 #define _FP_NANSIGN_E 1
 #define _FP_NANSIGN_Q 1
-
-#define _FP_KEEPNANFRACP 1
-
-/* Here is something Intel misdesigned: the specs don't define
-   the case where we have two NaNs with same mantissas, but
-   different sign.  Different operations pick up different NaNs.  */
-#define _FP_CHOOSENAN(fs, wc, R, X, Y, OP) \
-  do { \
-    if (_FP_FRAC_GT_##wc(X, Y) \
-        || (_FP_FRAC_EQ_##wc(X,Y) && (OP == '+' || OP == '*'))) \
-      { \
-        R##_s = X##_s; \
-        _FP_FRAC_COPY_##wc(R,X); \
-      } \
-    else \
-      { \
-        R##_s = Y##_s; \
-        _FP_FRAC_COPY_##wc(R,Y); \
-      } \
-    R##_c = FP_CLS_NAN; \
-  } while (0)
-
-#define FP_EX_INVALID 0x01
-#define FP_EX_DENORM 0x02
-#define FP_EX_DIVZERO 0x04
-#define FP_EX_OVERFLOW 0x08
-#define FP_EX_UNDERFLOW 0x10
-#define FP_EX_INEXACT 0x20
-
-struct fenv
-{
-  unsigned short int __control_word;
-  unsigned short int __unused1;
-  unsigned short int __status_word;
-  unsigned short int __unused2;
-  unsigned short int __tags;
-  unsigned short int __unused3;
-  unsigned int __eip;
-  unsigned short int __cs_selector;
-  unsigned int __opcode:11;
-  unsigned int __unused4:5;
-  unsigned int __data_offset;
-  unsigned short int __data_selector;
-  unsigned short int __unused5;
-};
-
-#define FP_HANDLE_EXCEPTIONS \
-  do { \
-    if (_fex & FP_EX_INVALID) \
-      { \
-        float f = 0.0; \
-        __asm__ __volatile__ ("fdiv {%y0, %0|%0, %y0}" : "+t" (f)); \
-        __asm__ __volatile__ ("fwait"); \
-      } \
-    if (_fex & FP_EX_DIVZERO) \
-      { \
-        float f = 1.0, g = 0.0; \
-        __asm__ __volatile__ ("fdivp {%0, %y1|%y1, %0}" \
-                              : "+t" (f) : "u" (g) \
-                              : "st(1)"); \
-        __asm__ __volatile__ ("fwait"); \
-      } \
-    if (_fex & FP_EX_OVERFLOW) \
-      { \
-        struct fenv temp; \
-        __asm__ __volatile__ ("fnstenv %0" : "=m" (temp)); \
-        temp.__status_word |= FP_EX_OVERFLOW; \
-        __asm__ __volatile__ ("fldenv %0" : : "m" (temp)); \
-        __asm__ __volatile__ ("fwait"); \
-      } \
-    if (_fex & FP_EX_UNDERFLOW) \
-      { \
-        struct fenv temp; \
-        __asm__ __volatile__ ("fnstenv %0" : "=m" (temp)); \
-        temp.__status_word |= FP_EX_UNDERFLOW; \
-        __asm__ __volatile__ ("fldenv %0" : : "m" (temp)); \
-        __asm__ __volatile__ ("fwait"); \
-      } \
-    if (_fex & FP_EX_INEXACT) \
-      { \
-        struct fenv temp; \
-        __asm__ __volatile__ ("fnstenv %0" : "=m" (temp)); \
-        temp.__status_word |= FP_EX_INEXACT; \
-        __asm__ __volatile__ ("fldenv %0" : : "m" (temp)); \
-        __asm__ __volatile__ ("fwait"); \
-      } \
-  } while (0)
-
-#define FP_RND_NEAREST 0
-#define FP_RND_ZERO 0xc00
-#define FP_RND_PINF 0x800
-#define FP_RND_MINF 0x400
-
-#define _FP_DECL_EX \
-  unsigned short _fcw __attribute__ ((unused)) = FP_RND_NEAREST
-
-#define FP_INIT_ROUNDMODE \
-  do { \
-    __asm__ ("fnstcw %0" : "=m" (_fcw)); \
-  } while (0)
-
-#define FP_ROUNDMODE (_fcw & 0xc00)
-
-#define __LITTLE_ENDIAN 1234
-#define __BIG_ENDIAN 4321
-
-#define __BYTE_ORDER __LITTLE_ENDIAN
-
-/* Define ALIASNAME as a strong alias for NAME.  */
-#if defined __MACH__
-/* Mach-O doesn't support aliasing.  If these functions ever return
-   anything but CMPtype we need to revisit this... */
-#define strong_alias(name, aliasname) \
-  CMPtype aliasname (TFtype a, TFtype b) { return name(a, b); }
-#else
-# define strong_alias(name, aliasname) _strong_alias(name, aliasname)
-# define _strong_alias(name, aliasname) \
-  extern __typeof (name) aliasname __attribute__ ((alias (#name)));
-#endif
Index: config/i386/t-softfp
===================================================================
--- config/i386/t-softfp (revision 0)
+++ config/i386/t-softfp (working copy)
@@ -0,0 +1 @@
+LIB2ADD += $(srcdir)/config/i386/sfp-exceptions.c
Index: config/i386/sfp-exceptions.c
===================================================================
--- config/i386/sfp-exceptions.c (revision 0)
+++ config/i386/sfp-exceptions.c (working copy)
@@ -0,0 +1,90 @@
+/*
+ * Copyright (C) 2012 Free Software Foundation, Inc.
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 3, or (at your option) any
+ * later version.
+ *
+ * This file is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * Under Section 7 of GPL version 3, you are granted additional
+ * permissions described in the GCC Runtime Library Exception, version
+ * 3.1, as published by the Free Software Foundation.
+ *
+ * You should have received a copy of the GNU General Public License and
+ * a copy of the GCC Runtime Library Exception along with this program;
+ * see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+ * .
+ */
+
+#include "sfp-machine.h"
+
+struct fenv
+{
+  unsigned short int __control_word;
+  unsigned short int __unused1;
+  unsigned short int __status_word;
+  unsigned short int __unused2;
+  unsigned short int __tags;
+  unsigned short int __unused3;
+  unsigned int __eip;
+  unsigned short int __cs_selector;
+  unsigned int __opcode:11;
+  unsigned int __unused4:5;
+  unsigned int __data_offset;
+  unsigned short int __data_selector;
+  unsigned short int __unused5;
+};
+
+void
+__sfp_handle_exceptions (int _fex)
+{
+  if (_fex & FP_EX_INVALID)
+    {
+      float f = 0.0f;
+#ifdef __SSE__
+      asm volatile ("%vdivss\t{%0, %d0|%d0, %0}" : "+x" (f));
+#else
+      asm volatile ("fdiv\t{%y0, %0|%0, %y0}" : "+t" (f));
+      asm volatile ("fwait");
+#endif
+    }
+  if (_fex & FP_EX_DIVZERO)
+    {
+      float f = 1.0f, g = 0.0f;
+#ifdef __SSE__
+      asm volatile ("%vdivss\t{%1, %d0|%d0, %1}" : "+x" (f) : "xm" (g));
+#else
+      asm volatile ("fdivs\t%1" : "+t" (f) : "m" (g));
+      asm volatile ("fwait");
+#endif
+    }
+  if (_fex & FP_EX_OVERFLOW)
+    {
+      struct fenv temp;
+      asm volatile ("fnstenv\t%0" : "=m" (temp));
+      temp.__status_word |= FP_EX_OVERFLOW;
+      asm volatile ("fldenv\t%0" : : "m" (temp));
+      asm volatile ("fwait");
+    }
+  if (_fex & FP_EX_UNDERFLOW)
+    {
+      struct fenv temp;
+      asm volatile ("fnstenv\t%0" : "=m" (temp));
+      temp.__status_word |= FP_EX_UNDERFLOW;
+      asm volatile ("fldenv\t%0" : : "m" (temp));
+      asm volatile ("fwait");
+    }
+  if (_fex & FP_EX_INEXACT)
+    {
+      struct fenv temp;
+      asm volatile ("fnstenv\t%0" : "=m" (temp));
+      temp.__status_word |= FP_EX_INEXACT;
+      asm volatile ("fldenv\t%0" : : "m" (temp));
+      asm volatile ("fwait");
+    }
+}
Index: config.host
===================================================================
--- config.host (revision 188333)
+++ config.host (working copy)
@@ -1153,7 +1153,7 @@
         if test "${host_address}" = 32; then
           tmake_file="${tmake_file} i386/${host_address}/t-softfp"
         fi
-        tmake_file="${tmake_file} t-softfp"
+        tmake_file="${tmake_file} i386/t-softfp t-softfp"
         ;;
 esac
 
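
For readers who want the shape of the change without the x86 asm, here is a
minimal portable sketch of the same pattern: each use site expands only a
mask test, and the bulky handling lives in one out-of-line function. All
names below (DEMO_EX_*, demo_handle_exceptions, DEMO_HANDLE_EXCEPTIONS,
demo_div) are hypothetical and not part of libgcc. The handler here merely
records flags, whereas the real __sfp_handle_exceptions raises hardware
exceptions, and integer division stands in for a soft-fp operation.

```c
/* Hypothetical flag values, chosen to mirror the FP_EX_* bit layout
   shown in the patch.  */
#define DEMO_EX_INVALID 0x01
#define DEMO_EX_DIVZERO 0x04
#define DEMO_EX_INEXACT 0x20
#define DEMO_EX_MASK    0x3f

static int demo_raised;   /* Union of all flags seen by the handler.  */
static int demo_calls;    /* Number of out-of-line handler calls.  */

/* Out-of-line slow path: one shared copy for every caller.
   Assumption: it only records the flags instead of raising them.  */
static void
demo_handle_exceptions (int fex)
{
  demo_raised |= fex & DEMO_EX_MASK;
  demo_calls++;
}

/* Inline fast path: only the mask test is expanded at each use.
   There is no semicolon after `while (0)', so the macro behaves as
   a single statement, including inside an unbraced if/else.  */
#define DEMO_HANDLE_EXCEPTIONS(fex)   \
  do {                                \
    if ((fex) & DEMO_EX_MASK)         \
      demo_handle_exceptions (fex);   \
  } while (0)

/* Stand-in for a soft-fp routine: integer division that collects
   exception flags locally and reports them once at the end.  */
static int
demo_div (int a, int b)
{
  int fex = 0;
  int r = 0;

  if (b == 0)
    fex |= (a == 0) ? DEMO_EX_INVALID : DEMO_EX_DIVZERO;
  else
    {
      r = a / b;
      if (a % b != 0)
        fex |= DEMO_EX_INEXACT;
    }

  DEMO_HANDLE_EXCEPTIONS (fex);
  return r;
}
```

Calling demo_div (6, 3) never reaches the handler, while demo_div (7, 2)
and demo_div (1, 0) each make exactly one out-of-line call. The common
exception-free path therefore costs one test and branch per operation.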