diff mbox

[AArch64] Use target builtin instead of __builtin_sqrt for vsqrt_f64

Message ID 54B3EDC9.7080004@arm.com
State New
Headers show

Commit Message

Kyrylo Tkachov Jan. 12, 2015, 3:52 p.m. UTC
Hi all,

As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html 
and discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end 
up in a call to the library sqrt at -O0. To avoid that this patch uses a 
target builtin for sqrt on DF mode and uses that to implement the intrinsic.

With this patch I don't see sqrt calls being created at -O0 on a large 
arm_neon.h testcase where they were generated before.
aarch64-none-elf testing and the intrinsics testsuite in particular are 
clean.
Ok for trunk?

Thanks,
Kyrill

2015-01-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
     * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
     instead of __builtin_sqrt.

Comments

Andrew Pinski Jan. 12, 2015, 5:30 p.m. UTC | #1
On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
> Hi all,
>
> As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and
> discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a
> call to the library sqrt at -O0. To avoid that this patch uses a target
> builtin for sqrt on DF mode and uses that to implement the intrinsic.
>
> With this patch I don't see sqrt calls being created at -O0 on a large
> arm_neon.h testcase where they were generated before.
> aarch64-none-elf testing and the intrinsics testsuite in particular are
> clean.
> Ok for trunk?

Maybe have a target fold which folds this into sqrt if -fno-math-errno
is supplied.  This might be useful the -ffast-math case.
Maybe also fold it when a constant is supplied too.

Thanks,
Andrew

>
> Thanks,
> Kyrill
>
> 2015-01-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
>     * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
>     instead of __builtin_sqrt.
James Greenhalgh Jan. 19, 2015, 3:44 p.m. UTC | #2
On Mon, Jan 12, 2015 at 05:30:46PM +0000, Andrew Pinski wrote:
> On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
> > Hi all,
> >
> > As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and
> > discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a
> > call to the library sqrt at -O0. To avoid that this patch uses a target
> > builtin for sqrt on DF mode and uses that to implement the intrinsic.
> >
> > With this patch I don't see sqrt calls being created at -O0 on a large
> > arm_neon.h testcase where they were generated before.
> > aarch64-none-elf testing and the intrinsics testsuite in particular are
> > clean.
> > Ok for trunk?
> 
> Maybe have a target fold which folds this into sqrt if -fno-math-errno
> is supplied.  This might be useful the -ffast-math case.
> Maybe also fold it when a constant is supplied too.

Given that we are now in Stage 4, I'd rather see this fixed for GCC 5.0
in the way Kyrill proposed than languishing on a TODO list. Though an
IOU ticket on bugzilla for the missed optimization seems a good idea
to me.

Unless Kyrill already has something in the works to address your
comment, this looks like the right short-term solution to me
(Though Marcus/Richard will have to approve it).

Thanks,
James

> > 2015-01-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> >
> >     * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
> >     * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
> >     instead of __builtin_sqrt.
>
Kyrylo Tkachov Jan. 19, 2015, 3:46 p.m. UTC | #3
On 19/01/15 15:44, James Greenhalgh wrote:
> On Mon, Jan 12, 2015 at 05:30:46PM +0000, Andrew Pinski wrote:
>> On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>>> Hi all,
>>>
>>> As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and
>>> discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a
>>> call to the library sqrt at -O0. To avoid that this patch uses a target
>>> builtin for sqrt on DF mode and uses that to implement the intrinsic.
>>>
>>> With this patch I don't see sqrt calls being created at -O0 on a large
>>> arm_neon.h testcase where they were generated before.
>>> aarch64-none-elf testing and the intrinsics testsuite in particular are
>>> clean.
>>> Ok for trunk?
>> Maybe have a target fold which folds this into sqrt if -fno-math-errno
>> is supplied.  This might be useful the -ffast-math case.
>> Maybe also fold it when a constant is supplied too.
> Given that we are now in Stage 4, I'd rather see this fixed for GCC 5.0
> in the way Kyrill proposed than languishing on a TODO list. Though an
> IOU ticket on bugzilla for the missed optimization seems a good idea
> to me.
>
> Unless Kyrill already has something in the works to address your
> comment, this looks like the right short-term solution to me
> (Though Marcus/Richard will have to approve it).

Sorry, this slipped through the cracks.
I agree with James. A missed-optimization issue on bugzilla would be 
helpful to keep track of this.

Kyrill

>
> Thanks,
> James
>
>>> 2015-01-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>
>>>      * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
>>>      * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
>>>      instead of __builtin_sqrt.
Kyrylo Tkachov Jan. 27, 2015, 9:45 a.m. UTC | #4
On 19/01/15 15:46, Kyrill Tkachov wrote:
> On 19/01/15 15:44, James Greenhalgh wrote:
>> On Mon, Jan 12, 2015 at 05:30:46PM +0000, Andrew Pinski wrote:
>>> On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>>>> Hi all,
>>>>
>>>> As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and
>>>> discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a
>>>> call to the library sqrt at -O0. To avoid that this patch uses a target
>>>> builtin for sqrt on DF mode and uses that to implement the intrinsic.
>>>>
>>>> With this patch I don't see sqrt calls being created at -O0 on a large
>>>> arm_neon.h testcase where they were generated before.
>>>> aarch64-none-elf testing and the intrinsics testsuite in particular are
>>>> clean.
>>>> Ok for trunk?
>>> Maybe have a target fold which folds this into sqrt if -fno-math-errno
>>> is supplied.  This might be useful the -ffast-math case.
>>> Maybe also fold it when a constant is supplied too.
>> Given that we are now in Stage 4, I'd rather see this fixed for GCC 5.0
>> in the way Kyrill proposed than languishing on a TODO list. Though an
>> IOU ticket on bugzilla for the missed optimization seems a good idea
>> to me.
>>
>> Unless Kyrill already has something in the works to address your
>> comment, this looks like the right short-term solution to me
>> (Though Marcus/Richard will have to approve it).
> Sorry, this slipped through the cracks.
> I agree with James. A missed-optimization issue on bugzilla would be
> helpful to keep track of this.

I've filed PR 64821 to keep track of this for GCC 6.
Can I ping https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00710.html then?
It's a regression fix at -O0 so should be appropriate for stage4

Thanks,
Kyrill

>
> Kyrill
>
>> Thanks,
>> James
>>
>>>> 2015-01-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>>
>>>>       * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
>>>>       * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
>>>>       instead of __builtin_sqrt.
>
>
Kyrylo Tkachov Feb. 4, 2015, 11:06 a.m. UTC | #5
Ping.

https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00710.html


Thanks,
Kyrill

On 27/01/15 09:45, Kyrill Tkachov wrote:
> On 19/01/15 15:46, Kyrill Tkachov wrote:
>> On 19/01/15 15:44, James Greenhalgh wrote:
>>> On Mon, Jan 12, 2015 at 05:30:46PM +0000, Andrew Pinski wrote:
>>>> On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>>>>> Hi all,
>>>>>
>>>>> As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and
>>>>> discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a
>>>>> call to the library sqrt at -O0. To avoid that this patch uses a target
>>>>> builtin for sqrt on DF mode and uses that to implement the intrinsic.
>>>>>
>>>>> With this patch I don't see sqrt calls being created at -O0 on a large
>>>>> arm_neon.h testcase where they were generated before.
>>>>> aarch64-none-elf testing and the intrinsics testsuite in particular are
>>>>> clean.
>>>>> Ok for trunk?
>>>> Maybe have a target fold which folds this into sqrt if -fno-math-errno
>>>> is supplied.  This might be useful the -ffast-math case.
>>>> Maybe also fold it when a constant is supplied too.
>>> Given that we are now in Stage 4, I'd rather see this fixed for GCC 5.0
>>> in the way Kyrill proposed than languishing on a TODO list. Though an
>>> IOU ticket on bugzilla for the missed optimization seems a good idea
>>> to me.
>>>
>>> Unless Kyrill already has something in the works to address your
>>> comment, this looks like the right short-term solution to me
>>> (Though Marcus/Richard will have to approve it).
>> Sorry, this slipped through the cracks.
>> I agree with James. A missed-optimization issue on bugzilla would be
>> helpful to keep track of this.
> I've filed PR 64821 to keep track of this for GCC 6.
> Can I ping https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00710.html then?
> It's a regression fix at -O0 so should be appropriate for stage4
>
> Thanks,
> Kyrill
>
>> Kyrill
>>
>>> Thanks,
>>> James
>>>
>>>>> 2015-01-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>>>
>>>>>        * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
>>>>>        * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
>>>>>        instead of __builtin_sqrt.
>>
>
>
Marcus Shawcroft Feb. 4, 2015, 11:38 a.m. UTC | #6
On 12 January 2015 at 15:52, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
> Hi all,
>
> As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and
> discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a
> call to the library sqrt at -O0. To avoid that this patch uses a target
> builtin for sqrt on DF mode and uses that to implement the intrinsic.
>
> With this patch I don't see sqrt calls being created at -O0 on a large
> arm_neon.h testcase where they were generated before.
> aarch64-none-elf testing and the intrinsics testsuite in particular are
> clean.
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2015-01-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
>     * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
>     instead of __builtin_sqrt.

OK /Marcus
Christophe Lyon Feb. 5, 2015, 8:53 a.m. UTC | #7
On 4 February 2015 at 12:38, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
> On 12 January 2015 at 15:52, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>> Hi all,
>>
>> As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and
>> discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a
>> call to the library sqrt at -O0. To avoid that this patch uses a target
>> builtin for sqrt on DF mode and uses that to implement the intrinsic.
>>
>> With this patch I don't see sqrt calls being created at -O0 on a large
>> arm_neon.h testcase where they were generated before.
>> aarch64-none-elf testing and the intrinsics testsuite in particular are
>> clean.
>> Ok for trunk?
>>
>> Thanks,
>> Kyrill
>>
>> 2015-01-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
>>     * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
>>     instead of __builtin_sqrt.
>
> OK /Marcus

Hi,

I have noticed that this patch makes the following test fail:
FAIL:  gcc.dg/tree-ssa/foldconst-6.c scan-tree-dump-not ccp1 "666"
on aarch64 targets.

Christophe.
diff mbox

Patch

commit 865be1cc8365886904d571e244746815e2317162
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Fri Jan 9 12:18:59 2015 +0000

    [AArch64] Use target builtin for vsqrt_f64

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index b41d9f6..60cd1d7 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -41,7 +41,7 @@ 
 
   BUILTIN_VDC (COMBINE, combine, 0)
   BUILTIN_VB (BINOP, pmul, 0)
-  BUILTIN_VDQF (UNOP, sqrt, 2)
+  BUILTIN_VDQF_DF (UNOP, sqrt, 2)
   BUILTIN_VD_BHSI (BINOP, addp, 0)
   VAR1 (UNOP, addp, 0, di)
   BUILTIN_VDQ_BHSI (UNOP, clrsb, 2)
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index c679802..3b151a2 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -22194,7 +22194,7 @@  vsqrtq_f32 (float32x4_t a)
 __extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
 vsqrt_f64 (float64x1_t a)
 {
-  return (float64x1_t) { __builtin_sqrt (a[0]) };
+  return (float64x1_t) { __builtin_aarch64_sqrtdf (a[0]) };
 }
 
 __extension__ static __inline float64x2_t __attribute__ ((__always_inline__))