diff mbox

[i386] __builtin_ia32_stmxcsr could be pure

Message ID alpine.DEB.2.02.1705261038260.10588@stedding.saclay.inria.fr
State New
Headers show

Commit Message

Marc Glisse May 26, 2017, 8:55 a.m. UTC
Hello,

glibc marks fegetround as a pure function. On x86, people tend to use 
_MM_GET_ROUNDING_MODE instead, which could benefit from the same. I think 
it is safe, but a second opinion would be welcome.

I could have handled just this builtin, but it seemed better to provide 
def_builtin_pure (like "const" already has) since there should be other 
builtins that can be marked this way (maybe the gathers?).

Bootstrap+testsuite on x86_64-pc-linux-gnu with default languages.

2017-05-29  Marc Glisse  <marc.glisse@inria.fr>

gcc/
 	* config/i386/i386.c (struct builtin_isa): New field pure_p.
 	Reorder for compactness.
 	(def_builtin, def_builtin2, ix86_add_new_builtins): Handle pure_p.
 	(def_builtin_pure, def_builtin_pure2): New functions.
 	(ix86_init_mmx_sse_builtins) [__builtin_ia32_stmxcsr]: Mark as pure.

gcc/testsuite/
 	* gcc.target/i386/getround.c: New file.

Comments

Richard Biener May 26, 2017, 9:06 a.m. UTC | #1
On Fri, May 26, 2017 at 10:55 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Hello,
>
> glibc marks fegetround as a pure function. On x86, people tend to use
> _MM_GET_ROUNDING_MODE instead, which could benefit from the same. I think it
> is safe, but a second opinion would be welcome.

Sounds good.  The important part is to keep the dependency to SET_ROUNDING_MODE
which is done via claiming both touch global memory.

> I could have handled just this builtin, but it seemed better to provide
> def_builtin_pure (like "const" already has) since there should be other
> builtins that can be marked this way (maybe the gathers?).

Should work for gathers.  They could even use stronger guarantees,
namely a fnspec with "..R" (the pointer argument is only read from directly).
Similarly scatter can use ".W" (the pointer argument is only written to
directly).

Richard.

> Bootstrap+testsuite on x86_64-pc-linux-gnu with default languages.
>
> 2017-05-29  Marc Glisse  <marc.glisse@inria.fr>
>
> gcc/
>         * config/i386/i386.c (struct builtin_isa): New field pure_p.
>         Reorder for compactness.
>         (def_builtin, def_builtin2, ix86_add_new_builtins): Handle pure_p.
>         (def_builtin_pure, def_builtin_pure2): New functions.
>         (ix86_init_mmx_sse_builtins) [__builtin_ia32_stmxcsr]: Mark as pure.
>
> gcc/testsuite/
>         * gcc.target/i386/getround.c: New file.
>
> --
> Marc Glisse
Marc Glisse June 3, 2017, 7:04 p.m. UTC | #2
Hello,

I don't think Richard's "sounds good" was meant as "ok to commit". Does an 
x86 maintainer want to approve or criticize the patch?

https://gcc.gnu.org/ml/gcc-patches/2017-05/msg02009.html

On Fri, 26 May 2017, Richard Biener wrote:

> On Fri, May 26, 2017 at 10:55 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> Hello,
>>
>> glibc marks fegetround as a pure function. On x86, people tend to use
>> _MM_GET_ROUNDING_MODE instead, which could benefit from the same. I think it
>> is safe, but a second opinion would be welcome.
>
> Sounds good.  The important part is to keep the dependency to SET_ROUNDING_MODE
> which is done via claiming both touch global memory.
>
>> I could have handled just this builtin, but it seemed better to provide
>> def_builtin_pure (like "const" already has) since there should be other
>> builtins that can be marked this way (maybe the gathers?).
>
> Should work for gathers.  They could even use stronger guarantees,
> namely a fnspec with "..R" (the pointer argument is only read from directly).
> Similarly scatter can use ".W" (the pointer argument is only written to
> directly).
>
> Richard.
>
>> Bootstrap+testsuite on x86_64-pc-linux-gnu with default languages.
>>
>> 2017-05-29  Marc Glisse  <marc.glisse@inria.fr>
>>
>> gcc/
>>         * config/i386/i386.c (struct builtin_isa): New field pure_p.
>>         Reorder for compactness.
>>         (def_builtin, def_builtin2, ix86_add_new_builtins): Handle pure_p.
>>         (def_builtin_pure, def_builtin_pure2): New functions.
>>         (ix86_init_mmx_sse_builtins) [__builtin_ia32_stmxcsr]: Mark as pure.
>>
>> gcc/testsuite/
>>         * gcc.target/i386/getround.c: New file.
>>
>> --
>> Marc Glisse
>
Marc Glisse June 20, 2017, 9:34 p.m. UTC | #3
Ping.

On Sat, 3 Jun 2017, Marc Glisse wrote:

> Hello,
>
> I don't think Richard's "sounds good" was meant as "ok to commit". Does an 
> x86 maintainer want to approve or criticize the patch?
>
> https://gcc.gnu.org/ml/gcc-patches/2017-05/msg02009.html
>
> On Fri, 26 May 2017, Richard Biener wrote:
>
>> On Fri, May 26, 2017 at 10:55 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>> Hello,
>>> 
>>> glibc marks fegetround as a pure function. On x86, people tend to use
>>> _MM_GET_ROUNDING_MODE instead, which could benefit from the same. I think 
>>> it
>>> is safe, but a second opinion would be welcome.
>> 
>> Sounds good.  The important part is to keep the dependency to 
>> SET_ROUNDING_MODE
>> which is done via claiming both touch global memory.
>> 
>>> I could have handled just this builtin, but it seemed better to provide
>>> def_builtin_pure (like "const" already has) since there should be other
>>> builtins that can be marked this way (maybe the gathers?).
>> 
>> Should work for gathers.  They could even use stronger guarantees,
>> namely a fnspec with "..R" (the pointer argument is only read from 
>> directly).
>> Similarly scatter can use ".W" (the pointer argument is only written to
>> directly).
>> 
>> Richard.
>> 
>>> Bootstrap+testsuite on x86_64-pc-linux-gnu with default languages.
>>> 
>>> 2017-05-29  Marc Glisse  <marc.glisse@inria.fr>
>>> 
>>> gcc/
>>>         * config/i386/i386.c (struct builtin_isa): New field pure_p.
>>>         Reorder for compactness.
>>>         (def_builtin, def_builtin2, ix86_add_new_builtins): Handle pure_p.
>>>         (def_builtin_pure, def_builtin_pure2): New functions.
>>>         (ix86_init_mmx_sse_builtins) [__builtin_ia32_stmxcsr]: Mark as 
>>> pure.
>>> 
>>> gcc/testsuite/
>>>         * gcc.target/i386/getround.c: New file.
>>> 
>>> --
>>> Marc Glisse
>> 
>
>
diff mbox

Patch

Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 248449)
+++ gcc/config/i386/i386.c	(working copy)
@@ -31952,25 +31952,26 @@  enum ix86_builtins
   IX86_BUILTIN__BDESC_MAX_LAST = IX86_BUILTIN__BDESC_MAX_FIRST
 };
 
 /* Table for the ix86 builtin decls.  */
 static GTY(()) tree ix86_builtins[(int) IX86_BUILTIN_MAX];
 
 /* Table of all of the builtin functions that are possible with different ISA's
    but are waiting to be built until a function is declared to use that
    ISA.  */
 struct builtin_isa {
-  const char *name;		/* function name */
-  enum ix86_builtin_func_type tcode; /* type to use in the declaration */
   HOST_WIDE_INT isa;		/* isa_flags this builtin is defined for */
   HOST_WIDE_INT isa2;		/* additional isa_flags this builtin is defined for */
-  bool const_p;			/* true if the declaration is constant */
+  const char *name;		/* function name */
+  enum ix86_builtin_func_type tcode; /* type to use in the declaration */
+  unsigned char const_p:1;	/* true if the declaration is constant */
+  unsigned char pure_p:1;	/* true if the declaration has pure attribute */
   bool leaf_p;			/* true if the declaration has leaf attribute */
   bool nothrow_p;		/* true if the declaration has nothrow attribute */
   bool set_and_not_built_p;
 };
 
 static struct builtin_isa ix86_builtins_isa[(int) IX86_BUILTIN_MAX];
 
 /* Bits that can still enable any inclusion of a builtin.  */
 static HOST_WIDE_INT deferred_isa_values = 0;
 static HOST_WIDE_INT deferred_isa_values2 = 0;
@@ -32027,20 +32028,21 @@  def_builtin (HOST_WIDE_INT mask, const c
 	{
 	  /* Just a MASK where set_and_not_built_p == true can potentially
 	     include a builtin.  */
 	  deferred_isa_values |= mask;
 	  ix86_builtins[(int) code] = NULL_TREE;
 	  ix86_builtins_isa[(int) code].tcode = tcode;
 	  ix86_builtins_isa[(int) code].name = name;
 	  ix86_builtins_isa[(int) code].leaf_p = false;
 	  ix86_builtins_isa[(int) code].nothrow_p = false;
 	  ix86_builtins_isa[(int) code].const_p = false;
+	  ix86_builtins_isa[(int) code].pure_p = false;
 	  ix86_builtins_isa[(int) code].set_and_not_built_p = true;
 	}
     }
 
   return decl;
 }
 
 /* Like def_builtin, but also marks the function decl "const".  */
 
 static inline tree
@@ -32049,20 +32051,35 @@  def_builtin_const (HOST_WIDE_INT mask, c
 {
   tree decl = def_builtin (mask, name, tcode, code);
   if (decl)
     TREE_READONLY (decl) = 1;
   else
     ix86_builtins_isa[(int) code].const_p = true;
 
   return decl;
 }
 
+/* Like def_builtin, but also marks the function decl "pure".  */
+
+static inline tree
+def_builtin_pure (HOST_WIDE_INT mask, const char *name,
+		  enum ix86_builtin_func_type tcode, enum ix86_builtins code)
+{
+  tree decl = def_builtin (mask, name, tcode, code);
+  if (decl)
+    DECL_PURE_P (decl) = 1;
+  else
+    ix86_builtins_isa[(int) code].pure_p = true;
+
+  return decl;
+}
+
 /* Like def_builtin, but for additional isa2 flags.  */
 
 static inline tree
 def_builtin2 (HOST_WIDE_INT mask, const char *name,
 	      enum ix86_builtin_func_type tcode,
 	      enum ix86_builtins code)
 {
   tree decl = NULL_TREE;
 
   ix86_builtins_isa[(int) code].isa2 = mask;
@@ -32083,20 +32100,21 @@  def_builtin2 (HOST_WIDE_INT mask, const
     {
       /* Just a MASK where set_and_not_built_p == true can potentially
 	 include a builtin.  */
       deferred_isa_values2 |= mask;
       ix86_builtins[(int) code] = NULL_TREE;
       ix86_builtins_isa[(int) code].tcode = tcode;
       ix86_builtins_isa[(int) code].name = name;
       ix86_builtins_isa[(int) code].leaf_p = false;
       ix86_builtins_isa[(int) code].nothrow_p = false;
       ix86_builtins_isa[(int) code].const_p = false;
+      ix86_builtins_isa[(int) code].pure_p = false;
       ix86_builtins_isa[(int) code].set_and_not_built_p = true;
     }
 
   return decl;
 }
 
 /* Like def_builtin, but also marks the function decl "const".  */
 
 static inline tree
 def_builtin_const2 (HOST_WIDE_INT mask, const char *name,
@@ -32104,20 +32122,35 @@  def_builtin_const2 (HOST_WIDE_INT mask,
 {
   tree decl = def_builtin2 (mask, name, tcode, code);
   if (decl)
     TREE_READONLY (decl) = 1;
   else
     ix86_builtins_isa[(int) code].const_p = true;
 
   return decl;
 }
 
+/* Like def_builtin, but also marks the function decl "pure".  */
+
+static inline tree
+def_builtin_pure2 (HOST_WIDE_INT mask, const char *name,
+		   enum ix86_builtin_func_type tcode, enum ix86_builtins code)
+{
+  tree decl = def_builtin2 (mask, name, tcode, code);
+  if (decl)
+    DECL_PURE_P (decl) = 1;
+  else
+    ix86_builtins_isa[(int) code].pure_p = true;
+
+  return decl;
+}
+
 /* Add any new builtin functions for a given ISA that may not have been
    declared.  This saves a bit of space compared to adding all of the
    declarations to the tree, even if we didn't use them.  */
 
 static void
 ix86_add_new_builtins (HOST_WIDE_INT isa, HOST_WIDE_INT isa2)
 {
   if ((isa & deferred_isa_values) == 0
       && (isa2 & deferred_isa_values2) == 0)
     return;
@@ -32142,20 +32175,22 @@  ix86_add_new_builtins (HOST_WIDE_INT isa
 	  ix86_builtins_isa[i].set_and_not_built_p = false;
 
 	  type = ix86_get_builtin_func_type (ix86_builtins_isa[i].tcode);
 	  decl = add_builtin_function_ext_scope (ix86_builtins_isa[i].name,
 						 type, i, BUILT_IN_MD, NULL,
 						 NULL_TREE);
 
 	  ix86_builtins[i] = decl;
 	  if (ix86_builtins_isa[i].const_p)
 	    TREE_READONLY (decl) = 1;
+	  if (ix86_builtins_isa[i].pure_p)
+	    DECL_PURE_P (decl) = 1;
 	  if (ix86_builtins_isa[i].leaf_p)
 	    DECL_ATTRIBUTES (decl) = build_tree_list (get_identifier ("leaf"),
 						      NULL_TREE);
 	  if (ix86_builtins_isa[i].nothrow_p)
 	    TREE_NOTHROW (decl) = 1;
 	}
     }
 
   current_target_pragma = saved_current_target_pragma;
 }
@@ -32495,22 +32530,22 @@  ix86_init_mmx_sse_builtins (void)
 	ftype = INT_FTYPE_V4SF_V4SF;
       def_builtin_const (d->mask, d->name, ftype, d->code);
     }
   BDESC_VERIFYS (IX86_BUILTIN__BDESC_COMI_LAST,
 		 IX86_BUILTIN__BDESC_COMI_FIRST,
 		 ARRAY_SIZE (bdesc_comi) - 1);
 
   /* SSE */
   def_builtin (OPTION_MASK_ISA_SSE, "__builtin_ia32_ldmxcsr",
 	       VOID_FTYPE_UNSIGNED, IX86_BUILTIN_LDMXCSR);
-  def_builtin (OPTION_MASK_ISA_SSE, "__builtin_ia32_stmxcsr",
-	       UNSIGNED_FTYPE_VOID, IX86_BUILTIN_STMXCSR);
+  def_builtin_pure (OPTION_MASK_ISA_SSE, "__builtin_ia32_stmxcsr",
+		    UNSIGNED_FTYPE_VOID, IX86_BUILTIN_STMXCSR);
 
   /* SSE or 3DNow!A */
   def_builtin (OPTION_MASK_ISA_SSE | OPTION_MASK_ISA_3DNOW_A,
 	       "__builtin_ia32_maskmovq", VOID_FTYPE_V8QI_V8QI_PCHAR,
 	       IX86_BUILTIN_MASKMOVQ);
 
   /* SSE2 */
   def_builtin (OPTION_MASK_ISA_SSE2, "__builtin_ia32_maskmovdqu",
 	       VOID_FTYPE_V16QI_V16QI_PCHAR, IX86_BUILTIN_MASKMOVDQU);
 
Index: gcc/testsuite/gcc.target/i386/getround.c
===================================================================
--- gcc/testsuite/gcc.target/i386/getround.c	(nonexistent)
+++ gcc/testsuite/gcc.target/i386/getround.c	(working copy)
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O -msse" } */
+
+#include <xmmintrin.h>
+
+unsigned save;
+
+void f(unsigned mode){
+  unsigned tmp = _MM_GET_ROUNDING_MODE();
+  _MM_SET_ROUNDING_MODE(mode);
+  save = tmp;
+}
+
+/* { dg-final { scan-assembler-times "stmxcsr" 1 } } */