[libgfortran]: Use __builtin_ia32_{stmxcsr,ldmxcsr} intrinsics in config/fpu-387.h

Submitter Uros Bizjak
Date Sept. 5, 2012, 9:37 p.m.
Message ID <CAFULd4ZYcanR0CeiROJEdr_NPKDMf6Fo1YQXN4T9wKGDeHOYog@mail.gmail.com>
Permalink /patch/181962/
State New

Comments

Uros Bizjak - Sept. 5, 2012, 9:37 p.m.
On Wed, Sep 5, 2012 at 11:30 PM, Uros Bizjak <ubizjak@gmail.com> wrote:

>>         * config/fpu-387.h (set_fpu): Use __builtin_ia32_stmxcsr and
>>         __builtin_ia32_ldmxcsr intrinsics.
>>
>> Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline and 4.7 branch.
>
> I forgot that these builtins are enabled for SSE only (and x86_64
> bootstrap enables SSE2 by default), so the following addition is needed:
>
> --cut here--
> Index: config/fpu-387.h
> ===================================================================
> --- config/fpu-387.h    (revision 190992)
> +++ config/fpu-387.h    (working copy)
> @@ -96,7 +96,11 @@
>  #define _FPU_MASK_UM  0x10
>  #define _FPU_MASK_PM  0x20
>
> -void set_fpu (void)
> +void
> +#ifndef __SSE__
> +__attribute__((__target__("sse")))
> +#endif
> +set_fpu (void)
>  {
>    unsigned short cw;
>
> --cut here--
>
> Re-tested on x86_64-pc-linux-gnu and committed.

... Not really. The "sse" target attribute also enables cmov, which
should not be used on plain x86_32.

In the end, let's revert to assembly, with the following change, which
was intended from the beginning:


Sorry for the trouble,
Uros.
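
The reasoning behind going back to assembly can be sketched roughly as
follows. Inline asm emits stmxcsr/ldmxcsr directly, so the enclosing
function does not have to be compiled with SSE enabled, and the compiler
gets no licence to use cmov or SSE instructions elsewhere in it; a
run-time check then guards the two instructions. This is only a minimal
sketch under stated assumptions, not the libgfortran code: the names
mxcsr_available, mxcsr_read and mxcsr_write are hypothetical,
__builtin_cpu_supports stands in for the has_sse() helper visible in the
patch context, and the %v prefix used in the patch is left out.

```c
/* Sketch only: hypothetical helpers, not part of libgfortran.  */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
# define MXCSR_ASM 1
#else
# define MXCSR_ASM 0
#endif

/* Nonzero when the running CPU reports SSE support.  */
int
mxcsr_available (void)
{
#if MXCSR_ASM
  return __builtin_cpu_supports ("sse") != 0;
#else
  return 0;
#endif
}

/* Store the SSE control/status register into *out.
   Returns 0 on success, -1 when SSE is not available.  */
int
mxcsr_read (unsigned int *out)
{
#if MXCSR_ASM
  if (!mxcsr_available ())
    return -1;
  __asm__ volatile ("stmxcsr %0" : "=m" (*out));
  return 0;
#else
  (void) out;
  return -1;
#endif
}

/* Load VALUE into the SSE control/status register.
   Returns 0 on success, -1 when SSE is not available.  */
int
mxcsr_write (unsigned int value)
{
#if MXCSR_ASM
  if (!mxcsr_available ())
    return -1;
  __asm__ volatile ("ldmxcsr %0" : : "m" (value));
  return 0;
#else
  (void) value;
  return -1;
#endif
}
```

A caller would check the return value of mxcsr_read before modifying
and writing back the register; on targets without SSE both helpers
simply report failure.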

Patch

Index: config/fpu-387.h
===================================================================
--- config/fpu-387.h    (revision 190992)
+++ config/fpu-387.h    (working copy)
@@ -118,7 +118,7 @@
     {
       unsigned int cw_sse;

-      cw_sse = __builtin_ia32_stmxcsr ();
+      asm volatile ("%vstmxcsr %0" : "=m" (cw_sse));

       cw_sse &= 0xffff0000;
       cw_sse |= (_FPU_MASK_IM | _FPU_MASK_DM | _FPU_MASK_ZM
@@ -131,6 +131,6 @@ 
       if (options.fpe & GFC_FPE_UNDERFLOW) cw_sse &= ~(_FPU_MASK_UM << 7);
       if (options.fpe & GFC_FPE_INEXACT) cw_sse &= ~(_FPU_MASK_PM << 7);

-      __builtin_ia32_ldmxcsr (cw_sse);
+      asm volatile ("%vldmxcsr %0" : : "m" (cw_sse));
     }
 }
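
The "<< 7" in the hunk above reflects the register layout: the six
exception mask bits sit at bits 0..5 of the x87 control word and at
bits 7..12 of MXCSR, so the same mask constants can be reused with a
shift. The following is a small stand-alone sketch of that bit
arithmetic only. The MASK_* constants mirror the _FPU_MASK_* values,
while the TRAP_* flags and the function mxcsr_with_traps are
hypothetical stand-ins for GFC_FPE_* and for the body of set_fpu; no
register is actually touched.

```c
/* Exception mask bits as laid out in the x87 control word.  */
#define MASK_IM  0x01   /* invalid operation */
#define MASK_DM  0x02   /* denormal operand */
#define MASK_ZM  0x04   /* divide by zero */
#define MASK_OM  0x08   /* overflow */
#define MASK_UM  0x10   /* underflow */
#define MASK_PM  0x20   /* inexact (precision) */

/* Hypothetical trap-selection flags, standing in for GFC_FPE_*.  */
enum
{
  TRAP_INVALID   = 1,
  TRAP_DENORMAL  = 2,
  TRAP_ZERO      = 4,
  TRAP_OVERFLOW  = 8,
  TRAP_UNDERFLOW = 16,
  TRAP_INEXACT   = 32
};

/* Return MXCSR with every exception masked except those in TRAPS.
   Bits outside the mask field (7..12) are left unchanged.  */
unsigned int
mxcsr_with_traps (unsigned int mxcsr, int traps)
{
  /* Mask all six exceptions first.  */
  mxcsr |= (MASK_IM | MASK_DM | MASK_ZM
            | MASK_OM | MASK_UM | MASK_PM) << 7;

  /* Then clear the mask bit of each exception that should trap.  */
  if (traps & TRAP_INVALID)   mxcsr &= ~(MASK_IM << 7);
  if (traps & TRAP_DENORMAL)  mxcsr &= ~(MASK_DM << 7);
  if (traps & TRAP_ZERO)      mxcsr &= ~(MASK_ZM << 7);
  if (traps & TRAP_OVERFLOW)  mxcsr &= ~(MASK_OM << 7);
  if (traps & TRAP_UNDERFLOW) mxcsr &= ~(MASK_UM << 7);
  if (traps & TRAP_INEXACT)   mxcsr &= ~(MASK_PM << 7);

  return mxcsr;
}
```

For example, starting from the power-on value 0x1f80 (all exceptions
masked) and requesting only the underflow trap clears bit 11, giving
0x1780.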