Message ID | 1566335985-14601-5-git-send-email-pc@us.ibm.com |
---|---|
State | New |
Series | various FPSCR optimizations |
On 8/20/19 4:19 PM, Paul A. Clarke wrote:
> From: "Paul A. Clarke" <pc@us.ibm.com>
>
> fegetenv_status() wants to use the lighter weight instruction 'mffsl'
> for reading the Floating-Point Status and Control Register (FPSCR).
> It currently will use it directly if compiled '-mcpu=power9', and will
> perform a runtime check (cpu_supports("arch_3_00")) otherwise.
>
> Nicely, it turns out that the 'mffsl' instruction will decode to
> 'mffs' on architectures older than "arch_3_00" because the additional
> bits set for 'mffsl' are "don't care" for 'mffs'. 'mffs' is a superset
> of 'mffsl'.

That is a pretty neat trick. I would recommend inlining the above comment. Otherwise, LGTM.

> -#define fegetenv_status_ISA300() \
> +#define fegetenv_status() \
>    ({register double __fr; \
>      __asm__ __volatile__ ( \
>        ".machine push; .machine \"power9\"; mffsl %0; .machine pop" \
> @@ -45,18 +45,6 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;
>      __fr; \
>    })
diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h
index 8ba4832..186612b 100644
--- a/sysdeps/powerpc/fpu/fenv_libc.h
+++ b/sysdeps/powerpc/fpu/fenv_libc.h
@@ -37,7 +37,7 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;

 /* Equivalent to fegetenv_register, but only returns bits for
    status, exception enables, and mode. */
-#define fegetenv_status_ISA300() \
+#define fegetenv_status() \
   ({register double __fr; \
     __asm__ __volatile__ ( \
       ".machine push; .machine \"power9\"; mffsl %0; .machine pop" \
@@ -45,18 +45,6 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;
     __fr; \
   })

-#ifdef _ARCH_PWR9
-# define fegetenv_status() fegetenv_status_ISA300()
-#elif defined __BUILTIN_CPU_SUPPORTS__
-# define fegetenv_status() \
-  (__glibc_likely (__builtin_cpu_supports ("arch_3_00")) \
-   ? fegetenv_status_ISA300() \
-   : fegetenv_register() \
-  )
-#else
-# define fegetenv_status() fegetenv_register ()
-#endif
-
 /* Equivalent to fesetenv, but takes a fenv_t instead of a pointer. */
 #define fesetenv_register(env) \
   do { \
From: "Paul A. Clarke" <pc@us.ibm.com>

fegetenv_status() wants to use the lighter weight instruction 'mffsl'
for reading the Floating-Point Status and Control Register (FPSCR).
It currently will use it directly if compiled '-mcpu=power9', and will
perform a runtime check (cpu_supports("arch_3_00")) otherwise.

Nicely, it turns out that the 'mffsl' instruction will decode to
'mffs' on architectures older than "arch_3_00" because the additional
bits set for 'mffsl' are "don't care" for 'mffs'. 'mffs' is a superset
of 'mffsl'.

So, just generate 'mffsl'.

2019-08-20  Paul A. Clarke  <pc@us.ibm.com>

    * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status_ISA300): Delete.
    (fegetenv_status): Generate 'mffsl' unconditionally.
---
 sysdeps/powerpc/fpu/fenv_libc.h | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)
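
The "don't care" argument in the commit message can be illustrated with a small, portable sketch of the two instruction encodings. This is not part of the patch. The field values (primary opcode 63, extended opcode 583, and the selector 0b11000 in instruction bits 11:15 for 'mffsl') are assumptions based on a reading of Power ISA 3.0 and should be checked against the ISA document. The helper names are hypothetical, and decode_ignoring_selector() is only a simplified model of how an older core might treat those bits.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: field values as read from Power ISA 3.0; verify before use.  */
#define PRIMARY_OPCODE  63u    /* instruction bits 0:5 (IBM bit numbering) */
#define XO_MFFS         583u   /* extended opcode, instruction bits 21:30 */
#define MFFSL_SELECTOR  0x18u  /* instruction bits 11:15 = 0b11000 for 'mffsl' */

/* Encode 'mffs FRT' with Rc = 0: the selector field is all zeros.  */
static uint32_t
encode_mffs (unsigned frt)
{
  assert (frt < 32);
  return (PRIMARY_OPCODE << 26) | ((uint32_t) frt << 21) | (XO_MFFS << 1);
}

/* Encode 'mffsl FRT': same as 'mffs' plus the selector bits.  */
static uint32_t
encode_mffsl (unsigned frt)
{
  return encode_mffs (frt) | (MFFSL_SELECTOR << 16);
}

/* Hypothetical model of a decoder that treats instruction bits 11:15
   as "don't care", as the commit message describes for older cores.  */
static uint32_t
decode_ignoring_selector (uint32_t insn)
{
  return insn & ~(UINT32_C (0x1f) << 16);
}
```

Under these assumptions, decode_ignoring_selector (encode_mffsl (n)) equals encode_mffs (n) for every register number n, which is the property that lets the patch emit 'mffsl' unconditionally and drop the runtime check.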