[AArch64] Use target builtin instead of __builtin_sqrt for vsqrt_f64

Message ID 20150205102720.GA9072@arm.com
State New
Commit Message

James Greenhalgh Feb. 5, 2015, 10:27 a.m. UTC
On Thu, Feb 05, 2015 at 08:53:00AM +0000, Christophe Lyon wrote:
> On 4 February 2015 at 12:38, Marcus Shawcroft
> <marcus.shawcroft@gmail.com> wrote:
> > On 12 January 2015 at 15:52, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
> >> Hi all,
> >>
> >> As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and
> >> discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a
> >> call to the library sqrt at -O0. To avoid that this patch uses a target
> >> builtin for sqrt on DF mode and uses that to implement the intrinsic.
> >>
> >> With this patch I don't see sqrt calls being created at -O0 on a large
> >> arm_neon.h testcase where they were generated before.
> >> aarch64-none-elf testing and the intrinsics testsuite in particular are
> >> clean.
> >> Ok for trunk?
> >>
> >> Thanks,
> >> Kyrill
> >>
> >> 2015-01-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> >>
> >>     * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
> >>     * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
> >>     instead of __builtin_sqrt.
> >
> > OK /Marcus
>
> Hi,
>
> I have noticed that this patch makes the following test fail:
> FAIL:  gcc.dg/tree-ssa/foldconst-6.c scan-tree-dump-not ccp1 "666"
> on aarch64 targets.

It seems highly unlikely given what this patch does, and what that
test is looking for, however...

By the sort of exceptional bad luck that only comes around once
every thousand builtins, we now define enough builtins when
compiling for aarch64-none-linux-gnu (and ONLY when compiling
for aarch64-none-linux-gnu, and ONLY when under a test harness!) that
gcc.dg/tree-ssa/foldconst-6.c happens to fail.

The test looks for "666" appearing anywhere in the dumps, but if you
happen to have 2665 other decls kicking around, you may well get
a decl_uid of 2666, as we will with aarch64-none-linux-gnu until we
inevitably add some more builtins.

  ;; Function f (f, funcdef_no=0, decl_uid=2666, cgraph_uid=0, symbol_order=0)

  f (vec * r)
  {
    vec b;
    vec a;

    <bb 2>:
    *r_5(D) = { -1, 0 };
    return;
  }

The patch below changes the pattern the test scans for so that it also
includes the first element of the vector. With that, the test passes on
aarch64. I also tried hacking out the vector comparison folding code and
confirmed that this makes the test fail, so it still catches a missing fold.

I've committed this under the obvious rule as revision 220440.

Cheers,
James

---
2015-02-05  James Greenhalgh  <james.greenhalgh@arm.com>

	* gcc.dg/tree-ssa/foldconst-6.c: Change expected pattern for
	tree dump scanning.
Patch

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c b/gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c
index 0c08f8f..92424b8 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c
@@ -10,5 +10,5 @@  void f (vec *r)
   *r = a < b;
 }
 
-/* { dg-final { scan-tree-dump-not "666" "ccp1"} } */
+/* { dg-final { scan-tree-dump-not "2, 666" "ccp1" } } */
 /* { dg-final { cleanup-tree-dump "ccp1" } } */