Message ID | 1347639215-8388-1-git-send-email-tuliom@linux.vnet.ibm.com |
---|---|
State | New |
Headers | show |
Hi Tulio, > @@ -14103,6 +14105,84 @@ > "" > "") > > +(define_expand "rs6000_get_timebase" > + [(use (match_operand:DI 0 "gpc_reg_operand" ""))] > + "" > + " > +{ > + if (TARGET_POWERPC64) > + emit_insn (gen_rs6000_get_timebase_ppc64 (operands[0])); > + else > + emit_insn (gen_rs6000_get_timebase_ppc32 (operands[0])); > + DONE; > +}") Please don't put quotes around the C block. > +(define_insn "rs6000_get_timebase_ppc32" > + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") > + (unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB)) > + (clobber (match_scratch:SI 1 "=r")) > + (clobber (match_scratch:CC 2 "=x"))] You can do "y" instead, to allow all CRn fields. > + "!TARGET_POWERPC64" > +{ > + if (WORDS_BIG_ENDIAN) > + if (TARGET_MFCRF) > + { > + return "mfspr %0, 269\;" > + "mfspr %L0, 268\;" > + "mfspr %1, 269\;" > + "cmpw %0,%1\;" > + "bne- $-16"; > + } > + else > + { > + return "mftbu %0\;" > + "mftb %L0\;" > + "mftbu %1\;" > + "cmpw %0,%1\;" > + "bne- $-16"; > + } > + else > + if (TARGET_MFCRF) > + { > + return "mfspr %L0, 269\;" > + "mfspr %0, 268\;" > + "mfspr %1, 269\;" > + "cmpw %L0,%1\;" > + "bne- $-16"; > + } > + else > + { > + return "mftbu %L0\;" > + "mftb %0\;" > + "mftbu %1\;" > + "cmpw %L0,%1\;" > + "bne- $-16"; > + } > +}) I don't think TARGET_MFCRF is correct. For example, if you use -mcpu=powerpc64 (which doesn't set this flag) you will get code that does not run on the newer machines. > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > index e850266..b3fc236 100644 > --- a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -13660,6 +13660,8 @@ float __builtin_rsqrtf (float); > double __builtin_recipdiv (double, double); > double __builtin_rsqrt (double); > long __builtin_bpermd (long, long); > +uint64_t __builtin_ppc_get_timebase (); > +unsigned long __builtin_ppc_mftb (); > @end smallexample This is section "PowerPC AltiVec/VSX Built-in Functions"; there should be a preceding separate "PowerPC Built-in Functions" section. Some of the current documentation should go there, too. > @@ -13671,6 +13673,14 @@ The @code{__builtin_recipdiv}, and @code > {__builtin_recipdivf} > functions generate multiple instructions to implement division using > the reciprocal estimate instructions. > > +The @code{__builtin_ppc_get_timebase} and @code{__builtin_ppc_mftb} > +functions generate instructions to read the Time Base Register. The > +@code{__builtin_ppc_get_timebase} function may generate multiple > +instructions and always return the 64 bits of the Time Base > Register. The s/return/returns/ > +@code{__builtin_ppc_mftb} function always generate one instruction > and generates > +returns the Time Base Register value as an unsigned long, throwing > away > +the most significant word on 32-bit environments. Looks good other than those nits, and the MFCRF thing. Oh, and you didn't say how you tested it (what targets). Segher
Segher Boessenkool <segher@kernel.crashing.org> writes: > Hi Tulio, Hi Segher, >> +(define_expand "rs6000_get_timebase" >> + [(use (match_operand:DI 0 "gpc_reg_operand" ""))] >> + "" >> + " >> +{ >> + if (TARGET_POWERPC64) >> + emit_insn (gen_rs6000_get_timebase_ppc64 (operands[0])); >> + else >> + emit_insn (gen_rs6000_get_timebase_ppc32 (operands[0])); >> + DONE; >> +}") > > Please don't put quotes around the C block. Fixed. > >> +(define_insn "rs6000_get_timebase_ppc32" >> + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") >> + (unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB)) >> + (clobber (match_scratch:SI 1 "=r")) >> + (clobber (match_scratch:CC 2 "=x"))] > > You can do "y" instead, to allow all CRn fields. Fixed. >> + "!TARGET_POWERPC64" >> +{ >> + if (WORDS_BIG_ENDIAN) >> + if (TARGET_MFCRF) >> + { >> + return "mfspr %0, 269\;" >> + "mfspr %L0, 268\;" >> + "mfspr %1, 269\;" >> + "cmpw %0,%1\;" >> + "bne- $-16"; >> + } >> + else >> + { >> + return "mftbu %0\;" >> + "mftb %L0\;" >> + "mftbu %1\;" >> + "cmpw %0,%1\;" >> + "bne- $-16"; >> + } >> + else >> + if (TARGET_MFCRF) >> + { >> + return "mfspr %L0, 269\;" >> + "mfspr %0, 268\;" >> + "mfspr %1, 269\;" >> + "cmpw %L0,%1\;" >> + "bne- $-16"; >> + } >> + else >> + { >> + return "mftbu %L0\;" >> + "mftb %0\;" >> + "mftbu %1\;" >> + "cmpw %L0,%1\;" >> + "bne- $-16"; >> + } >> +}) > > I don't think TARGET_MFCRF is correct. For example, if you use > -mcpu=powerpc64 (which doesn't set this flag) you will get code > that does not run on the newer machines. Sorry, but it seems to be working here... I explain how I tested this in the end of the email. >> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi >> index e850266..b3fc236 100644 >> --- a/gcc/doc/extend.texi >> +++ b/gcc/doc/extend.texi >> @@ -13660,6 +13660,8 @@ float __builtin_rsqrtf (float); >> double __builtin_recipdiv (double, double); >> double __builtin_rsqrt (double); >> long __builtin_bpermd (long, long); >> +uint64_t __builtin_ppc_get_timebase (); >> +unsigned long __builtin_ppc_mftb (); >> @end smallexample > > This is section "PowerPC AltiVec/VSX Built-in Functions"; there > should be a preceding separate "PowerPC Built-in Functions" > section. > Some of the current documentation should go there, too. Agreed. I'm changing this. >> @@ -13671,6 +13673,14 @@ The @code{__builtin_recipdiv}, and >> @code{__builtin_recipdivf} >> functions generate multiple instructions to implement division >> using >> the reciprocal estimate instructions. >> >> +The @code{__builtin_ppc_get_timebase} and >> @code{__builtin_ppc_mftb} >> +functions generate instructions to read the Time Base >> Register. The >> +@code{__builtin_ppc_get_timebase} function may generate >> multiple >> +instructions and always return the 64 bits of the Time Base >> Register. The > > s/return/returns/ Fixed. >> +@code{__builtin_ppc_mftb} function always generate one >> instruction and > > generates Fixed. >> +returns the Time Base Register value as an unsigned long, >> throwing away >> +the most significant word on 32-bit environments. > > > Looks good other than those nits, and the MFCRF thing. Oh, and > you > didn't say how you tested it (what targets). Sorry, I run the test suite on a Power7 running a 64-bit environment. Then, I manually built and run a modified copy of the 2 included tests that that only print the TBR values a 3 times to make sure they were increasing. I did this for both 32 and 64 bits on a Power3 and a Power7, making sure to use the correct -mcpu. In the end I checked that the generated assembly was right for Power3, Power7 and some random chooses I've done. Here are some snippets of code for __builtin_ppc_get_timebase, i.e.: - -m64 -mcpu=power7 .L.main: std 31,-8(1) stdu 1,-80(1) mr 31,1 mfspr 9, 268 std 9,56(31) li 9,0 stw 9,48(31) b .L2 - -m32 -mcpu=power3 main: stwu 1,-48(1) stw 31,44(1) mr 31,1 mftbu 8 mftb 9 mftbu 10 cmpw 8,10 bne- $-16 stw 8,24(31) stw 9,28(31) lfd 0,24(31) stfd 0,16(31) li 9,0 stw 9,8(31) b .L2 - -m64 -mcpu=power3 .L.main: std 31,-8(1) stdu 1,-80(1) mr 31,1 mftb 9 std 9,56(31) li 9,0 stw 9,48(31) b .L2 Thanks,
>> I don't think TARGET_MFCRF is correct. For example, if you use >> -mcpu=powerpc64 (which doesn't set this flag) you will get code >> that does not run on the newer machines. > > Sorry, but it seems to be working here... > I explain how I tested this in the end of the email. David tells me all current CPUs actually do support the MFTB insns just fine, so that there is no problem. Cheers, Segher
On Fri, Sep 14, 2012 at 4:52 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote: >>> I don't think TARGET_MFCRF is correct. For example, if you use >>> -mcpu=powerpc64 (which doesn't set this flag) you will get code >>> that does not run on the newer machines. >> >> Sorry, but it seems to be working here... >> I explain how I tested this in the end of the email. > > David tells me all current CPUs actually do support the MFTB insns > just fine, so that there is no problem. There is no ideal solution and this method will work on all PowerPC targets with a combination of GCC and Binutils support., so I think this is the best we can do. Thanks, David
Segher Boessenkool <segher@kernel.crashing.org> writes: >>> I don't think TARGET_MFCRF is correct. For example, if you >>> use >>> -mcpu=powerpc64 (which doesn't set this flag) you will get >>> code >>> that does not run on the newer machines. >> >> Sorry, but it seems to be working here... >> I explain how I tested this in the end of the email. > > David tells me all current CPUs actually do support the MFTB > insns > just fine, so that there is no problem. I don't understand the problem. Let me paste again this snippet of code here: It was generated using the __builtin_ppc_get_timebase test case with the extra parameters -S -m64 -mcpu=power7. The rest is the same used by the test suite. .L.main: std 31,-8(1) stdu 1,-80(1) mr 31,1 mfspr 9, 268 std 9,56(31) li 9,0 stw 9,48(31) b .L2 I've just done this same test for all -mcpu values from power3 through power7 and both -m32 and -m64. The only value that outputs mftb is power3 in both environments. What am I missing?
On Fri, Sep 14, 2012 at 8:34 PM, Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> wrote: > Segher Boessenkool <segher@kernel.crashing.org> writes: > I don't understand the problem. There is no problem. Segher is concerned that -mcpu=powerpc64, which is suppose to generate "generic" PPC64 code produces mftb instead of mfspr. However, there is no right answer and it is unclear if it is better for -mcpu=powerpc64 to support the most processors now or only forward compatibility for the future. for the moment, working on all PPC64 systems seems like the better option. we can revisit this in the future when POWER3 is even father in the past. - david
> There is no problem. > > Segher is concerned that -mcpu=powerpc64, which is suppose to generate > "generic" PPC64 code produces mftb instead of mfspr. And the MFTB instruction is phased out of the architecture. I was under the impression that now or in the near future there would be implementations where MFTB does not work. Hopefully this will never happen. > However, there > is no right answer and it is unclear if it is better for > -mcpu=powerpc64 to support the most processors now or only forward > compatibility for the future. for the moment, working on all PPC64 > systems seems like the better option. we can revisit this in the > future when POWER3 is even father in the past. Yes, my concern is that if you compile code as "generic", it will not work on too new systems. This would be rather unfortunate. There are more catches... You focus on 64-bit, but MFTB is also phased-out for 32-bit implementations. Also, assemblers will under certain circumstances emit MFSPR insns where you wrote MFTB. It's a huge mess. Let's just use what you have now, and we can tune this decision in the future. The far, far future, hopefully :-) Segher
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index c8f8f86..9fa3a0f 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -1429,6 +1429,12 @@ BU_SPECIAL_X (RS6000_BUILTIN_RSQRT, "__builtin_rsqrt", RS6000_BTM_FRSQRTE, BU_SPECIAL_X (RS6000_BUILTIN_RSQRTF, "__builtin_rsqrtf", RS6000_BTM_FRSQRTES, RS6000_BTC_FP) +BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, "__builtin_ppc_get_timebase", + RS6000_BTM_ALWAYS, RS6000_BTC_MISC) + +BU_SPECIAL_X (RS6000_BUILTIN_MFTB, "__builtin_ppc_mftb", + RS6000_BTM_ALWAYS, RS6000_BTC_MISC) + /* Darwin CfString builtin. */ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS, RS6000_BTC_MISC) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index a5a3848..c3bece1 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -9748,6 +9748,30 @@ rs6000_overloaded_builtin_p (enum rs6000_builtins fncode) return (rs6000_builtin_info[(int)fncode].attr & RS6000_BTC_OVERLOADED) != 0; } +/* Expand an expression EXP that calls a builtin without arguments. */ +static rtx +rs6000_expand_zeroop_builtin (enum insn_code icode, rtx target) +{ + rtx pat; + enum machine_mode tmode = insn_data[icode].operand[0].mode; + + if (icode == CODE_FOR_nothing) + /* Builtin not supported on this processor. */ + return 0; + + if (target == 0 + || GET_MODE (target) != tmode + || ! (*insn_data[icode].operand[0].predicate) (target, tmode)) + target = gen_reg_rtx (tmode); + + pat = GEN_FCN (icode) (target); + if (! pat) + return 0; + emit_insn (pat); + + return target; +} + static rtx rs6000_expand_unop_builtin (enum insn_code icode, tree exp, rtx target) @@ -11337,6 +11361,16 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED, ? CODE_FOR_bpermd_di : CODE_FOR_bpermd_si), exp, target); + case RS6000_BUILTIN_GET_TB: + return rs6000_expand_zeroop_builtin (CODE_FOR_rs6000_get_timebase, + target); + + case RS6000_BUILTIN_MFTB: + return rs6000_expand_zeroop_builtin (((TARGET_64BIT) + ? CODE_FOR_rs6000_mftb_di + : CODE_FOR_rs6000_mftb_si), + target); + case ALTIVEC_BUILTIN_MASK_FOR_LOAD: case ALTIVEC_BUILTIN_MASK_FOR_STORE: { @@ -11621,6 +11655,18 @@ rs6000_init_builtins (void) POWER7_BUILTIN_BPERMD, "__builtin_bpermd"); def_builtin ("__builtin_bpermd", ftype, POWER7_BUILTIN_BPERMD); + ftype = build_function_type_list (unsigned_intDI_type_node, + NULL_TREE); + def_builtin ("__builtin_ppc_get_timebase", ftype, RS6000_BUILTIN_GET_TB); + + if (TARGET_64BIT) + ftype = build_function_type_list (unsigned_intDI_type_node, + NULL_TREE); + else + ftype = build_function_type_list (unsigned_intSI_type_node, + NULL_TREE); + def_builtin ("__builtin_ppc_mftb", ftype, RS6000_BUILTIN_MFTB); + #if TARGET_XCOFF /* AIX libm provides clog as __clog. */ if ((tdecl = builtin_decl_explicit (BUILT_IN_CLOG)) != NULL_TREE) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index f2bc15f..41304c4 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -136,6 +136,8 @@ UNSPECV_PROBE_STACK_RANGE ; probe range of stack addresses UNSPECV_EH_RR ; eh_reg_restore UNSPECV_ISYNC ; isync instruction + UNSPECV_GETTB ; get time base built-in + UNSPECV_MFTB ; move from time base built-in ]) @@ -14103,6 +14105,84 @@ "" "") +(define_expand "rs6000_get_timebase" + [(use (match_operand:DI 0 "gpc_reg_operand" ""))] + "" + " +{ + if (TARGET_POWERPC64) + emit_insn (gen_rs6000_get_timebase_ppc64 (operands[0])); + else + emit_insn (gen_rs6000_get_timebase_ppc32 (operands[0])); + DONE; +}") + +(define_insn "rs6000_get_timebase_ppc32" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") + (unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB)) + (clobber (match_scratch:SI 1 "=r")) + (clobber (match_scratch:CC 2 "=x"))] + "!TARGET_POWERPC64" +{ + if (WORDS_BIG_ENDIAN) + if (TARGET_MFCRF) + { + return "mfspr %0, 269\;" + "mfspr %L0, 268\;" + "mfspr %1, 269\;" + "cmpw %0,%1\;" + "bne- $-16"; + } + else + { + return "mftbu %0\;" + "mftb %L0\;" + "mftbu %1\;" + "cmpw %0,%1\;" + "bne- $-16"; + } + else + if (TARGET_MFCRF) + { + return "mfspr %L0, 269\;" + "mfspr %0, 268\;" + "mfspr %1, 269\;" + "cmpw %L0,%1\;" + "bne- $-16"; + } + else + { + return "mftbu %L0\;" + "mftb %0\;" + "mftbu %1\;" + "cmpw %L0,%1\;" + "bne- $-16"; + } +}) + +(define_insn "rs6000_get_timebase_ppc64" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") + (unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))] + "TARGET_POWERPC64" +{ + if (TARGET_MFCRF) + return "mfspr %0, 268"; + else + return "mftb %0"; +}) + +(define_insn "rs6000_mftb_<mode>" + [(set (match_operand:P 0 "gpc_reg_operand" "=r") + (unspec_volatile:P [(const_int 0)] UNSPECV_MFTB))] + "" + { + if (TARGET_MFCRF) + return "mfspr %0, 268"; + else + return "mftb %0"; + }) + + (include "sync.md") diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index e850266..b3fc236 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -13660,6 +13660,8 @@ float __builtin_rsqrtf (float); double __builtin_recipdiv (double, double); double __builtin_rsqrt (double); long __builtin_bpermd (long, long); +uint64_t __builtin_ppc_get_timebase (); +unsigned long __builtin_ppc_mftb (); @end smallexample The @code{vec_rsqrt}, @code{__builtin_rsqrt}, and @@ -13671,6 +13673,14 @@ The @code{__builtin_recipdiv}, and @code{__builtin_recipdivf} functions generate multiple instructions to implement division using the reciprocal estimate instructions. +The @code{__builtin_ppc_get_timebase} and @code{__builtin_ppc_mftb} +functions generate instructions to read the Time Base Register. The +@code{__builtin_ppc_get_timebase} function may generate multiple +instructions and always return the 64 bits of the Time Base Register. The +@code{__builtin_ppc_mftb} function always generate one instruction and +returns the Time Base Register value as an unsigned long, throwing away +the most significant word on 32-bit environments. + @node RX Built-in Functions @subsection RX Built-in Functions GCC supports some of the RX instructions which cannot be expressed in diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c b/gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c new file mode 100644 index 0000000..71de679 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c @@ -0,0 +1,20 @@ +/* { dg-do run { target { powerpc*-*-* } } } */ + +/* Test if __builtin_ppc_get_timebase() is compatible with the current + processor and if it's changing between reads. A read failure might indicate + a Power ISA or binutils change. */ + +#include <inttypes.h> + +int +main(void) +{ + uint64_t t = __builtin_ppc_get_timebase (); + int j; + + for (j = 0; j < 1000000; j++) + if (t != __builtin_ppc_get_timebase ()) + return 0; + + return 1; +} diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-mftb.c b/gcc/testsuite/gcc.target/powerpc/ppc-mftb.c new file mode 100644 index 0000000..8eadaaa --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/ppc-mftb.c @@ -0,0 +1,18 @@ +/* { dg-do run { target { powerpc*-*-* } } } */ + +/* Test if __builtin_ppc_mftb() is compatible with the current processor and + if it's changing between reads. A read failure might indicate a Power + ISA or binutils change. */ + +int +main(void) +{ + unsigned long t = __builtin_ppc_mftb (); + int j; + + for (j = 0; j < 1000000; j++) + if (t != __builtin_ppc_mftb ()) + return 0; + + return 1; +}