From patchwork Fri May 22 14:46:12 2015
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 475635
Date: Fri, 22 May 2015 16:46:12 +0200
Subject: [PATCH, i386, libgcc]: Split SSE specific part from set_fast_math
From: Uros Bizjak
To: "gcc-patches@gcc.gnu.org"

Hello!

This patch splits the SSE-specific part of set_fast_math into its own
function, decorated with the "fxsr,sse" target attribute. This way, we
can avoid compiling the whole file with -msse, which implies generation
of possibly unsupported CMOVE insns. Additionally, we can now use the
generic t-crtfm makefile fragment.

2015-05-22  Uros Bizjak

	* config.host (i[34567]-*-*, x86_64-*-*): Add t-crtfm instead of
	i386/t-crtfm to tmake_file.
	* config/i386/crtfastmath.c (set_fast_math_sse): New function.
	(set_fast_math): Use set_fast_math_sse for SSE targets.
	* config/i386/t-crtfm: Remove.

Bootstrapped, regression tested on x86_64-linux-gnu {,-m32} and
committed to mainline SVN.

Uros.

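For readers unfamiliar with the per-function target attribute, the
technique described above can be sketched in isolation as follows. This
is only an illustration under stated assumptions, not code from the
patch: the names enable_ftz_sse and try_enable_ftz are hypothetical, and
the sketch sets only the FTZ bit, leaving out the FXSAVE-based DAZ check.
The point is that only the small helper is compiled with SSE enabled,
while the caller, built with the file's default flags, checks CPUID
before calling it.

```c
/* Illustrative sketch only; not part of the patch.  The names
   enable_ftz_sse and try_enable_ftz are hypothetical.  */
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>

#define MXCSR_FTZ (1u << 15)	/* Flush-to-zero bit of MXCSR.  */

/* Only this function is compiled with SSE enabled, so the rest of
   the file does not need -msse.  */
__attribute__ ((target ("sse")))
static unsigned int
enable_ftz_sse (void)
{
  unsigned int mxcsr = __builtin_ia32_stmxcsr ();
  mxcsr |= MXCSR_FTZ;
  __builtin_ia32_ldmxcsr (mxcsr);
  /* Read the register back so the caller can confirm the bit.  */
  return __builtin_ia32_stmxcsr ();
}

/* Built with the default flags.  Returns 1 if FTZ is now set,
   0 if CPUID leaf 1 is unavailable or reports no SSE.  */
int
try_enable_ftz (void)
{
  unsigned int eax, ebx, ecx, edx;

  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return 0;
  if (!(edx & bit_SSE))
    return 0;
  return (enable_ftz_sse () & MXCSR_FTZ) != 0;
}
#else
/* Non-x86 fallback so the sketch still compiles elsewhere.  */
int
try_enable_ftz (void)
{
  return 0;
}
#endif
```

Because the helper's target options differ from the caller's, the
compiler keeps it as a separate function instead of inlining it, which
is what keeps SSE instructions out of the code that runs before the
CPUID check.
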
Index: config.host
===================================================================
--- config.host	(revision 223519)
+++ config.host	(working copy)
@@ -553,12 +553,12 @@ hppa*-*-openbsd*)
 	tmake_file="$tmake_file pa/t-openbsd"
 	;;
 i[34567]86-*-darwin*)
-	tmake_file="$tmake_file i386/t-crtpc i386/t-crtfm"
+	tmake_file="$tmake_file i386/t-crtpc t-crtfm"
 	tm_file="$tm_file i386/darwin-lib.h"
 	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
 	;;
 x86_64-*-darwin*)
-	tmake_file="$tmake_file i386/t-crtpc i386/t-crtfm"
+	tmake_file="$tmake_file i386/t-crtpc t-crtfm"
 	tm_file="$tm_file i386/darwin-lib.h"
 	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
 	;;
@@ -595,24 +595,24 @@ x86_64-*-openbsd*)
 	;;
 i[34567]86-*-linux*)
 	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
-	tmake_file="${tmake_file} i386/t-crtpc i386/t-crtfm i386/t-crtstuff t-dfprules"
+	tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
 	tm_file="${tm_file} i386/elf-lib.h"
 	md_unwind_header=i386/linux-unwind.h
 	;;
 i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i[34567]86-*-gnu* | i[34567]86-*-kopensolaris*-gnu)
 	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
-	tmake_file="${tmake_file} i386/t-crtpc i386/t-crtfm i386/t-crtstuff t-dfprules"
+	tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
 	tm_file="${tm_file} i386/elf-lib.h"
 	;;
 x86_64-*-linux*)
 	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
-	tmake_file="${tmake_file} i386/t-crtpc i386/t-crtfm i386/t-crtstuff t-dfprules"
+	tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
 	tm_file="${tm_file} i386/elf-lib.h"
 	md_unwind_header=i386/linux-unwind.h
 	;;
 x86_64-*-kfreebsd*-gnu | x86_64-*-knetbsd*-gnu)
 	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
-	tmake_file="${tmake_file} i386/t-crtpc i386/t-crtfm i386/t-crtstuff t-dfprules"
+	tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
 	tm_file="${tm_file} i386/elf-lib.h"
 	;;
 i[34567]86-pc-msdosdjgpp*)
@@ -628,7 +628,7 @@ i[34567]86-*-rtems*)
 	extra_parts="$extra_parts crti.o crtn.o"
 	;;
 i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]*)
-	tmake_file="$tmake_file i386/t-crtpc i386/t-crtfm"
+	tmake_file="$tmake_file i386/t-crtpc t-crtfm"
 	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
 	tm_file="${tm_file} i386/elf-lib.h"
 	md_unwind_header=i386/sol2-unwind.h
@@ -652,7 +652,7 @@ i[34567]86-*-cygwin*)
 	else
 		tmake_dlldir_file="i386/t-dlldir-x"
 	fi
-	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-cygwin i386/t-crtfm i386/t-chkstk t-dfprules"
+	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-cygwin t-crtfm i386/t-chkstk t-dfprules"
 	;;
 x86_64-*-cygwin*)
 	extra_parts="crtbegin.o crtbeginS.o crtend.o crtfastmath.o"
@@ -672,7 +672,7 @@ x86_64-*-cygwin*)
 		tmake_dlldir_file="i386/t-dlldir-x"
 	fi
 	# FIXME - dj - t-chkstk used to be in here, need a 64-bit version of that
-	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-cygwin i386/t-crtfm t-dfprules i386/t-chkstk"
+	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-cygwin t-crtfm t-dfprules i386/t-chkstk"
 	;;
 i[34567]86-*-mingw*)
 	extra_parts="crtbegin.o crtend.o crtfastmath.o"
@@ -700,7 +700,7 @@ i[34567]86-*-mingw*)
 	else
 		tmake_dlldir_file="i386/t-dlldir-x"
 	fi
-	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-mingw32 i386/t-crtfm i386/t-chkstk t-dfprules"
+	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-mingw32 t-crtfm i386/t-chkstk t-dfprules"
 	;;
 x86_64-*-mingw*)
 	case ${target_thread_file} in
@@ -723,7 +723,7 @@ x86_64-*-mingw*)
 	else
 		tmake_dlldir_file="i386/t-dlldir-x"
 	fi
-	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-mingw32 t-dfprules i386/t-crtfm i386/t-chkstk"
+	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-mingw32 t-dfprules t-crtfm i386/t-chkstk"
 	extra_parts="$extra_parts crtbegin.o crtend.o crtfastmath.o"
 	if test x$enable_vtable_verify = xyes; then
 		extra_parts="$extra_parts vtv_start.o vtv_end.o vtv_start_preinit.o vtv_end_preinit.o"
Index: config/i386/t-crtfm
===================================================================
--- config/i386/t-crtfm	(revision 223519)
+++ config/i386/t-crtfm	(working copy)
@@ -1,4 +0,0 @@
-# This is an endfile, Use -minline-all-stringops to ensure
-# that __builtin_memset doesn't refer to the lib function memset().
-crtfastmath.o: $(srcdir)/config/i386/crtfastmath.c
-	$(gcc_compile) -mfxsr -msse -c $<
Index: config/i386/crtfastmath.c
===================================================================
--- config/i386/crtfastmath.c	(revision 223519)
+++ config/i386/crtfastmath.c	(working copy)
@@ -29,15 +29,57 @@
 /* All 64-bit targets have SSE and DAZ;
    only check them explicitly for 32-bit ones. */
 #include "cpuid.h"
-#endif
 
-static void __attribute__((constructor))
-#ifndef __x86_64__
+__attribute__ ((target("fxsr,sse")))
+static void
 /* The i386 ABI only requires 4-byte stack alignment, so this is necessary
    to make sure the fxsave struct gets correct alignment.
    See PR27537 and PR28621.  */
 __attribute__ ((force_align_arg_pointer))
+set_fast_math_sse (unsigned int edx)
+{
+  unsigned int mxcsr;
+
+  if (edx & bit_FXSAVE)
+    {
+      /* Check if DAZ is available.  */
+      struct
+      {
+        unsigned short cwd;
+        unsigned short swd;
+        unsigned short twd;
+        unsigned short fop;
+        unsigned int fip;
+        unsigned int fcs;
+        unsigned int foo;
+        unsigned int fos;
+        unsigned int mxcsr;
+        unsigned int mxcsr_mask;
+        unsigned int st_space[32];
+        unsigned int xmm_space[32];
+        unsigned int padding[56];
+      } __attribute__ ((aligned (16))) fxsave;
+
+      /* This is necessary since some implementations of FXSAVE
+         do not modify reserved areas within the image.  */
+      fxsave.mxcsr_mask = 0;
+
+      __builtin_ia32_fxsave (&fxsave);
+
+      mxcsr = fxsave.mxcsr;
+
+      if (fxsave.mxcsr_mask & MXCSR_DAZ)
+        mxcsr |= MXCSR_DAZ;
+    }
+  else
+    mxcsr = __builtin_ia32_stmxcsr ();
+
+  mxcsr |= MXCSR_FTZ;
+  __builtin_ia32_ldmxcsr (mxcsr);
+}
 #endif
+
+static void __attribute__((constructor))
 set_fast_math (void)
 {
 #ifndef __x86_64__
@@ -47,46 +89,7 @@ set_fast_math (void)
     return;
 
   if (edx & bit_SSE)
-    {
-      unsigned int mxcsr;
-
-      if (edx & bit_FXSAVE)
-        {
-          /* Check if DAZ is available.  */
-          struct
-          {
-            unsigned short cwd;
-            unsigned short swd;
-            unsigned short twd;
-            unsigned short fop;
-            unsigned int fip;
-            unsigned int fcs;
-            unsigned int foo;
-            unsigned int fos;
-            unsigned int mxcsr;
-            unsigned int mxcsr_mask;
-            unsigned int st_space[32];
-            unsigned int xmm_space[32];
-            unsigned int padding[56];
-          } __attribute__ ((aligned (16))) fxsave;
-
-          /* This is necessary since some implementations of FXSAVE
-             do not modify reserved areas within the image.  */
-          fxsave.mxcsr_mask = 0;
-
-          __builtin_ia32_fxsave (&fxsave);
-
-          mxcsr = fxsave.mxcsr;
-
-          if (fxsave.mxcsr_mask & MXCSR_DAZ)
-            mxcsr |= MXCSR_DAZ;
-        }
-      else
-        mxcsr = __builtin_ia32_stmxcsr ();
-
-      mxcsr |= MXCSR_FTZ;
-      __builtin_ia32_ldmxcsr (mxcsr);
-    }
+    set_fast_math_sse (edx);
 #else
   unsigned int mxcsr = __builtin_ia32_stmxcsr ();
   mxcsr |= MXCSR_DAZ | MXCSR_FTZ;