Patchwork [PATCHv3] rs6000: Add 2 built-ins to read the Time Base Register on PowerPC

Submitter Tulio Magno Quites Machado Filho
Date Sept. 17, 2012, 12:53 p.m.
Message ID <1347886403-6989-1-git-send-email-tuliom@linux.vnet.ibm.com>
Permalink /patch/184402/
State New

Comments

Tulio Magno Quites Machado Filho - Sept. 17, 2012, 12:53 p.m.
Add __builtin_ppc_get_timebase and __builtin_ppc_mftb to read the Time
Base Register on PowerPC.
They are required by applications that measure time at high frequency
and with high precision, and that cannot afford the cost of a syscall.
__builtin_ppc_get_timebase returns the 64 bits of the Time Base Register
while __builtin_ppc_mftb generates only 1 instruction and returns the
least significant word on 32-bit environments and the whole Time Base value
on 64-bit.
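
For illustration only, here is a minimal sketch of how an application
might use the 64-bit built-in to time a piece of work.  The names
read_ticks, ticks_for and busy_work are hypothetical, and the clock()
fallback is an assumption added so that the sketch also builds on
targets or compilers that lack the built-in; it is not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Detect the built-in where the compiler supports __has_builtin.
   Compilers without __has_builtin simply take the fallback path.  */
#if defined (__has_builtin)
# if __has_builtin (__builtin_ppc_get_timebase)
#  define HAVE_PPC_GET_TIMEBASE 1
# endif
#endif

/* Return a non-decreasing tick count (hypothetical helper).  */
static uint64_t
read_ticks (void)
{
#ifdef HAVE_PPC_GET_TIMEBASE
  /* Full 64 bits of the Time Base Register, no syscall involved.  */
  return __builtin_ppc_get_timebase ();
#else
  /* Assumption: portable stand-in with much coarser resolution.  */
  return (uint64_t) clock ();
#endif
}

/* Example workload (hypothetical); volatile keeps the loop alive.  */
static void
busy_work (void)
{
  volatile unsigned int i;
  for (i = 0; i < 100000; i++)
    ;
}

/* Return the number of ticks spent in one call of FN.  */
static uint64_t
ticks_for (void (*fn) (void))
{
  uint64_t start;

  assert (fn != NULL);
  start = read_ticks ();
  fn ();
  return read_ticks () - start;
}
```

Calling ticks_for (busy_work) gives the elapsed ticks for one run.
Converting ticks to seconds needs the Time Base frequency, which the
built-ins do not provide and which has to come from the platform.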

[gcc]
2012-09-17 Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>

	* config/rs6000/rs6000-builtin.def: Add __builtin_ppc_get_timebase
	and __builtin_ppc_mftb.
	* config/rs6000/rs6000.c (rs6000_expand_zeroop_builtin): New
	function to expand an expression that calls a built-in without
	arguments.
	(rs6000_expand_builtin): Add __builtin_ppc_get_timebase and
	__builtin_ppc_mftb.
	(rs6000_init_builtins): Likewise.
	* config/rs6000/rs6000.md: Likewise.
	* doc/extend.texi (PowerPC Built-in Functions): New section.
	(PowerPC AltiVec/VSX Built-in Functions):
	Move some built-ins unrelated to Altivec/VSX to the new section.

[gcc/testsuite]
2012-09-17 Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>

	* gcc.target/powerpc/ppc-get-timebase.c: New file.
	* gcc.target/powerpc/ppc-mftb.c: New file.
---
 gcc/config/rs6000/rs6000-builtin.def               |    6 ++
 gcc/config/rs6000/rs6000.c                         |   46 +++++++++++
 gcc/config/rs6000/rs6000.md                        |   79 ++++++++++++++++++++
 gcc/doc/extend.texi                                |   51 ++++++++-----
 .../gcc.target/powerpc/ppc-get-timebase.c          |   20 +++++
 gcc/testsuite/gcc.target/powerpc/ppc-mftb.c        |   18 +++++
 6 files changed, 202 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/ppc-mftb.c
David Edelsohn - Sept. 17, 2012, 8:17 p.m.
On Mon, Sep 17, 2012 at 8:53 AM, Tulio Magno Quites Machado Filho
<tuliom@linux.vnet.ibm.com> wrote:
> Add __builtin_ppc_get_timebase and __builtin_ppc_mftb to read the Time
> Base Register on PowerPC.
> They are required by applications that measure time at high frequency
> and with high precision, and that cannot afford the cost of a syscall.
> __builtin_ppc_get_timebase returns the 64 bits of the Time Base Register
> while __builtin_ppc_mftb generates only 1 instruction and returns the
> least significant word on 32-bit environments and the whole Time Base value
> on 64-bit.
>         (PowerPC AltiVec/VSX Built-in Functions):
>         Move some built-ins unrelated to Altivec/VSX to the new section.

This is a lot better and a lot closer.

> +(define_insn "rs6000_get_timebase_ppc32"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> +        (unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))
> +   (clobber (match_scratch:SI 1 "=r"))
> +   (clobber (match_scratch:CC 2 "=y"))]
> +  "!TARGET_POWERPC64"
> +{
> +  if (WORDS_BIG_ENDIAN)
> +    if (TARGET_MFCRF)
> +      {
> +        return "mfspr %0, 269\;"
> +              "mfspr %L0, 268\;"
> +              "mfspr %1, 269\;"
> +              "cmpw %0,%1\;"
> +              "bne- $-16";
> +      }
> +    else
> +      {
> +        return "mftbu %0\;"
> +              "mftb %L0\;"
> +              "mftbu %1\;"
> +              "cmpw %0,%1\;"
> +              "bne- $-16";
> +      }
> +  else
> +    if (TARGET_MFCRF)
> +      {
> +        return "mfspr %L0, 269\;"
> +              "mfspr %0, 268\;"
> +              "mfspr %1, 269\;"
> +              "cmpw %L0,%1\;"
> +              "bne- $-16";
> +      }
> +    else
> +      {
> +        return "mftbu %L0\;"
> +              "mftb %0\;"
> +              "mftbu %1\;"
> +              "cmpw %L0,%1\;"
> +              "bne- $-16";
> +      }
> +})

When Segher said to use "=y" for the condition register to generalize
the code so that it does not always allocate CR0, that also means the
emitted assembly needs to reference the specific condition register
field.  You need to tell the processor which CR field the compiler
thinks is assigned and clobbered.
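
For readers following the quoted pattern: the five-instruction sequence
reads the upper half, the lower half, then the upper half again, and
branches back if the two upper reads differ.  In the template, naming
the allocated field would presumably look something like
"cmpw %2,%0,%1" followed by "bne- %2,$-16"; that operand syntax is one
reading of the comment above, not text from the patch.  The retry logic
itself can be modelled in portable C.  This is a sketch with a simulated
counter; sim_tb, sim_mftbu, sim_mftb, get_timebase_32 and demo_rollover
are hypothetical names, and the counter advances by one on every half
read to mimic the register ticking between instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated 64-bit Time Base; a stand-in for the hardware register.  */
static uint64_t sim_tb;

/* Model of mftbu: return the upper word, then let the counter tick.  */
static uint32_t
sim_mftbu (void)
{
  uint32_t upper = (uint32_t) (sim_tb >> 32);
  sim_tb++;
  return upper;
}

/* Model of mftb on 32-bit: return the lower word, then tick.  */
static uint32_t
sim_mftb (void)
{
  uint32_t lower = (uint32_t) sim_tb;
  sim_tb++;
  return lower;
}

/* Read upper, lower, upper again; retry while the upper word changed,
   which means the lower word carried into it between the reads.  */
static uint64_t
get_timebase_32 (void)
{
  uint32_t hi, lo, hi2;

  do
    {
      hi = sim_mftbu ();
      lo = sim_mftb ();
      hi2 = sim_mftbu ();
    }
  while (hi != hi2);

  return ((uint64_t) hi << 32) | lo;
}

/* Start one tick before the lower word rolls over.  The first pass
   sees upper words 1 and 2 and is discarded; the second pass is
   consistent.  */
static void
demo_rollover (void)
{
  sim_tb = 0x00000001FFFFFFFFULL;
  assert (get_timebase_32 () == 0x0000000200000003ULL);
}
```

Without the second upper read, the first pass in demo_rollover would
have combined upper word 1 with lower word 0, a value about 2^32 ticks
behind the real counter.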

Thanks, David
Segher Boessenkool - Sept. 18, 2012, 2:50 p.m.
Hi Tulio,

Thanks for all the cleanups!

Two quite minor things...

> +(define_insn "rs6000_get_timebase_ppc64"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> +        (unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))]
> +  "TARGET_POWERPC64"
> +{
> +  if (TARGET_MFCRF)
> +    return "mfspr %0, 268";
> +  else
> +    return "mftb %0";
> +})
> +
> +(define_insn "rs6000_mftb_<mode>"
> +  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
> +        (unspec_volatile:P [(const_int 0)] UNSPECV_MFTB))]
> +  ""
> +  {
> +  if (TARGET_MFCRF)
> +    return "mfspr %0, 268";
> +  else
> +    return "mftb %0";
> +  })

These are identical; remove the _ppc64 pattern?
(The indenting of the {} is wrong in the mftb pattern).


Segher

Patch

diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index c8f8f86..9fa3a0f 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1429,6 +1429,12 @@  BU_SPECIAL_X (RS6000_BUILTIN_RSQRT, "__builtin_rsqrt", RS6000_BTM_FRSQRTE,
 BU_SPECIAL_X (RS6000_BUILTIN_RSQRTF, "__builtin_rsqrtf", RS6000_BTM_FRSQRTES,
 	      RS6000_BTC_FP)
 
+BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, "__builtin_ppc_get_timebase",
+	     RS6000_BTM_ALWAYS, RS6000_BTC_MISC)
+
+BU_SPECIAL_X (RS6000_BUILTIN_MFTB, "__builtin_ppc_mftb",
+	     RS6000_BTM_ALWAYS, RS6000_BTC_MISC)
+
 /* Darwin CfString builtin.  */
 BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS,
 	      RS6000_BTC_MISC)
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index a5a3848..c3bece1 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -9748,6 +9748,30 @@  rs6000_overloaded_builtin_p (enum rs6000_builtins fncode)
   return (rs6000_builtin_info[(int)fncode].attr & RS6000_BTC_OVERLOADED) != 0;
 }
 
+/* Expand an expression EXP that calls a builtin without arguments.  */
+static rtx
+rs6000_expand_zeroop_builtin (enum insn_code icode, rtx target)
+{
+  rtx pat;
+  enum machine_mode tmode = insn_data[icode].operand[0].mode;
+
+  if (icode == CODE_FOR_nothing)
+    /* Builtin not supported on this processor.  */
+    return 0;
+
+  if (target == 0
+      || GET_MODE (target) != tmode
+      || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
+    target = gen_reg_rtx (tmode);
+
+  pat = GEN_FCN (icode) (target);
+  if (! pat)
+    return 0;
+  emit_insn (pat);
+
+  return target;
+}
+
 
 static rtx
 rs6000_expand_unop_builtin (enum insn_code icode, tree exp, rtx target)
@@ -11337,6 +11361,16 @@  rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
 					   ? CODE_FOR_bpermd_di
 					   : CODE_FOR_bpermd_si), exp, target);
 
+    case RS6000_BUILTIN_GET_TB:
+      return rs6000_expand_zeroop_builtin (CODE_FOR_rs6000_get_timebase,
+					   target);
+
+    case RS6000_BUILTIN_MFTB:
+      return rs6000_expand_zeroop_builtin (((TARGET_64BIT)
+					    ? CODE_FOR_rs6000_mftb_di
+					    : CODE_FOR_rs6000_mftb_si),
+					   target);
+
     case ALTIVEC_BUILTIN_MASK_FOR_LOAD:
     case ALTIVEC_BUILTIN_MASK_FOR_STORE:
       {
@@ -11621,6 +11655,18 @@  rs6000_init_builtins (void)
 				 POWER7_BUILTIN_BPERMD, "__builtin_bpermd");
   def_builtin ("__builtin_bpermd", ftype, POWER7_BUILTIN_BPERMD);
 
+  ftype = build_function_type_list (unsigned_intDI_type_node,
+				    NULL_TREE);
+  def_builtin ("__builtin_ppc_get_timebase", ftype, RS6000_BUILTIN_GET_TB);
+
+  if (TARGET_64BIT)
+    ftype = build_function_type_list (unsigned_intDI_type_node,
+				      NULL_TREE);
+  else
+    ftype = build_function_type_list (unsigned_intSI_type_node,
+				      NULL_TREE);
+  def_builtin ("__builtin_ppc_mftb", ftype, RS6000_BUILTIN_MFTB);
+
 #if TARGET_XCOFF
   /* AIX libm provides clog as __clog.  */
   if ((tdecl = builtin_decl_explicit (BUILT_IN_CLOG)) != NULL_TREE)
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index f2bc15f..0497c20 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -136,6 +136,8 @@ 
    UNSPECV_PROBE_STACK_RANGE	; probe range of stack addresses
    UNSPECV_EH_RR		; eh_reg_restore
    UNSPECV_ISYNC		; isync instruction
+   UNSPECV_GETTB		; get time base built-in
+   UNSPECV_MFTB			; move from time base built-in
   ])
 
 
@@ -14103,6 +14105,83 @@ 
   ""
   "")
 
+(define_expand "rs6000_get_timebase"
+  [(use (match_operand:DI 0 "gpc_reg_operand" ""))]
+  ""
+{
+  if (TARGET_POWERPC64)
+    emit_insn (gen_rs6000_get_timebase_ppc64 (operands[0]));
+  else
+    emit_insn (gen_rs6000_get_timebase_ppc32 (operands[0]));
+  DONE;
+})
+
+(define_insn "rs6000_get_timebase_ppc32"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+        (unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))
+   (clobber (match_scratch:SI 1 "=r"))
+   (clobber (match_scratch:CC 2 "=y"))]
+  "!TARGET_POWERPC64"
+{
+  if (WORDS_BIG_ENDIAN)
+    if (TARGET_MFCRF)
+      {
+        return "mfspr %0, 269\;"
+	       "mfspr %L0, 268\;"
+	       "mfspr %1, 269\;"
+	       "cmpw %0,%1\;"
+	       "bne- $-16";
+      }
+    else
+      {
+        return "mftbu %0\;"
+	       "mftb %L0\;"
+	       "mftbu %1\;"
+	       "cmpw %0,%1\;"
+	       "bne- $-16";
+      }
+  else
+    if (TARGET_MFCRF)
+      {
+        return "mfspr %L0, 269\;"
+	       "mfspr %0, 268\;"
+	       "mfspr %1, 269\;"
+	       "cmpw %L0,%1\;"
+	       "bne- $-16";
+      }
+    else
+      {
+        return "mftbu %L0\;"
+	       "mftb %0\;"
+	       "mftbu %1\;"
+	       "cmpw %L0,%1\;"
+	       "bne- $-16";
+      }
+})
+
+(define_insn "rs6000_get_timebase_ppc64"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+        (unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))]
+  "TARGET_POWERPC64"
+{
+  if (TARGET_MFCRF)
+    return "mfspr %0, 268";
+  else
+    return "mftb %0";
+})
+
+(define_insn "rs6000_mftb_<mode>"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
+        (unspec_volatile:P [(const_int 0)] UNSPECV_MFTB))]
+  ""
+  {
+  if (TARGET_MFCRF)
+    return "mfspr %0, 268";
+  else
+    return "mftb %0";
+  })
+
+
 
 
 (include "sync.md")
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e850266..d351770 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8647,6 +8647,7 @@  instructions, but allow the compiler to schedule those calls.
 * MIPS Loongson Built-in Functions::
 * Other MIPS Built-in Functions::
 * picoChip Built-in Functions::
+* PowerPC Built-in Functions::
 * PowerPC AltiVec/VSX Built-in Functions::
 * RX Built-in Functions::
 * SPARC VIS Built-in Functions::
@@ -11596,6 +11597,38 @@  GCC defines the preprocessor macro @code{___GCC_HAVE_BUILTIN_MIPS_CACHE}
 when this function is available.
 @end table
 
+@node PowerPC Built-in Functions
+@subsection PowerPC Built-in Functions
+
+These built-in functions are available for the PowerPC family of
+processors:
+@smallexample
+float __builtin_recipdivf (float, float);
+float __builtin_rsqrtf (float);
+double __builtin_recipdiv (double, double);
+double __builtin_rsqrt (double);
+long __builtin_bpermd (long, long);
+uint64_t __builtin_ppc_get_timebase ();
+unsigned long __builtin_ppc_mftb ();
+@end smallexample
+
+The @code{vec_rsqrt}, @code{__builtin_rsqrt}, and
+@code{__builtin_rsqrtf} functions generate multiple instructions to
+implement the reciprocal sqrt functionality using reciprocal sqrt
+estimate instructions.
+
+The @code{__builtin_recipdiv}, and @code{__builtin_recipdivf}
+functions generate multiple instructions to implement division using
+the reciprocal estimate instructions.
+
+The @code{__builtin_ppc_get_timebase} and @code{__builtin_ppc_mftb}
+functions generate instructions to read the Time Base Register. The
+@code{__builtin_ppc_get_timebase} function may generate multiple
+instructions and always returns the 64 bits of the Time Base Register. The
+@code{__builtin_ppc_mftb} function always generates one instruction and
+returns the Time Base Register value as an unsigned long, throwing away
+the most significant word on 32-bit environments.
+
 @node PowerPC AltiVec/VSX Built-in Functions
 @subsection PowerPC AltiVec Built-in Functions
 
@@ -13653,24 +13686,6 @@  if the VSX instruction set is available.  The @samp{vec_vsx_ld} and
 @samp{vec_vsx_st} builtins will always generate the VSX @samp{LXVD2X},
 @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
 
-GCC provides a few other builtins on Powerpc to access certain instructions:
-@smallexample
-float __builtin_recipdivf (float, float);
-float __builtin_rsqrtf (float);
-double __builtin_recipdiv (double, double);
-double __builtin_rsqrt (double);
-long __builtin_bpermd (long, long);
-@end smallexample
-
-The @code{vec_rsqrt}, @code{__builtin_rsqrt}, and
-@code{__builtin_rsqrtf} functions generate multiple instructions to
-implement the reciprocal sqrt functionality using reciprocal sqrt
-estimate instructions.
-
-The @code{__builtin_recipdiv}, and @code{__builtin_recipdivf}
-functions generate multiple instructions to implement division using
-the reciprocal estimate instructions.
-
 @node RX Built-in Functions
 @subsection RX Built-in Functions
 GCC supports some of the RX instructions which cannot be expressed in
diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c b/gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c
new file mode 100644
index 0000000..71de679
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c
@@ -0,0 +1,20 @@ 
+/* { dg-do run { target { powerpc*-*-* } } } */
+
+/* Test if __builtin_ppc_get_timebase() is compatible with the current
+   processor and if it's changing between reads.  A read failure might indicate
+   a Power ISA or binutils change.  */
+
+#include <inttypes.h>
+
+int
+main(void)
+{
+  uint64_t t = __builtin_ppc_get_timebase ();
+  int j;
+
+  for (j = 0; j < 1000000; j++)
+    if (t != __builtin_ppc_get_timebase ())
+      return 0;
+
+  return 1;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-mftb.c b/gcc/testsuite/gcc.target/powerpc/ppc-mftb.c
new file mode 100644
index 0000000..8eadaaa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-mftb.c
@@ -0,0 +1,18 @@ 
+/* { dg-do run { target { powerpc*-*-* } } } */
+
+/* Test if __builtin_ppc_mftb() is compatible with the current processor and
+   if it's changing between reads.  A read failure might indicate a Power
+   ISA or binutils change.  */
+
+int
+main(void)
+{
+  unsigned long t = __builtin_ppc_mftb ();
+  int j;
+
+  for (j = 0; j < 1000000; j++)
+    if (t != __builtin_ppc_mftb ())
+      return 0;
+
+  return 1;
+}