diff mbox

Ping with testcase: [PATCH][AArch64] Fix __builtin_aarch64_absdi, must not fold to ABS_EXPR

Message ID 547EE633.2090506@arm.com
State New
Headers show

Commit Message

Alan Lawrence Dec. 3, 2014, 10:30 a.m. UTC
On Wed, Nov 26, 2014 at 04:35:50PM +0000, James Greenhalgh wrote:
 > Why do we want to turn off folding for the V4SF/V2SF/V2DF modes of these
 > intrinsics? There should be no difference between the mid-end definition
 > and the intrinsic definition of their behaviour.

Good point. Done.

 > I also note that the integer forms of these now end up as an "abs" RTL
 > expression - can we guarantee that preserves the intrinsics behaviour and
 > no RTL-folder will come along and mis-optimize? Documentation is vague
 > on this point.

I don't think we can guarantee that indefinitely, but it doesn't seem anyone has 
implemented such a rule *at present*. And we'd lose e.g. constant folding, if we 
switched to using an AArch64-specific UNSPEC. If anyone does add such a rule, 
I'd have some hope that the testcase will catch it...

 > I'm also not convinced by the SIMD_ARG_SCRATCH foo you add. Looking at
 > the aarch64.md:absdi2 pattern I can't see why we need that scratch at
 > all. It seems we could get away with marking operand 0 early-clobber and
 > using it as our scratch register. Then we could drop all the extra
 > infrastructure from this patch.

Hah, good point. Ok, I attach a revised patch, updating the pattern as you 
suggest and omitting the infrastructure changes. (I'm leaving the now-unused 
qualifier_internal as it stands, as there are arguments both for removing it and 
"fixing" it as per previous patch, but I don't think we should hold this up in 
the case that we want to have that discussion!)

Cross-tested check-gcc on aarch64-none-elf, including previously-posted test of 
vabs_s8.

Ok for trunk (with previously-posted testcase) ?
--Alan

gcc/ChangeLog:

	* config/aarch64/aarch64.md (absdi2): Remove scratch operand by
	earlyclobbering result operand.

	* config/aarch64/aarch64-builtins.c (aarch64_types_unop_qualifiers):
	Remove final qualifier_internal.
	(aarch64_fold_builtin): Stop folding abs builtins, except on floats.

Comments

Marcus Shawcroft Dec. 8, 2014, 1:34 p.m. UTC | #1
On 3 December 2014 at 10:30, Alan Lawrence <alan.lawrence@arm.com> wrote:
> On Wed, Nov 26, 2014 at 04:35:50PM +0000, James Greenhalgh wrote:
>> Why do we want to turn off folding for the V4SF/V2SF/V2DF modes of these
>> intrinsics? There should be no difference between the mid-end definition
>> and the intrinsic definition of their behaviour.
>
> Good point. Done.
>
>> I also note that the integer forms of these now end up as an "abs" RTL
>> expression - can we guarantee that preserves the intrinsics behaviour and
>> no RTL-folder will come along and mis-optimize? Documentation is vague
>> on this point.
>
> I don't think we can guarantee that indefinitely, but it doesn't seem anyone
> has implemented such a rule *at present*. And we'd lose e.g. constant
> folding, if we switched to using an AArch64-specific UNSPEC. If anyone does
> add such a rule, I'd have some hope that the testcase will catch it...
>
>> I'm also not convinced by the SIMD_ARG_SCRATCH foo you add. Looking at
>> the aarch64.md:absdi2 pattern I can't see why we need that scratch at
>> all. It seems we could get away with marking operand 0 early-clobber and
>> using it as our scratch register. Then we could drop all the extra
>> infrastructure from this patch.
>
> Hah, good point. Ok, I attach a revised patch, updating the pattern as you
> suggest and omitting the infrastructure changes. (I'm leaving the now-unused
> qualifier_internal as it stands, as there are arguments both for removing it
> and "fixing" it as per previous patch, but I don't think we should hold this
> up in the case that we want to have that discussion!)
>
> Cross-tested check-gcc on aarch64-none-elf, including previously-posted test
> of vabs_s8.
>
> Ok for trunk (with previously-posted testcase) ?
> --Alan
>
> gcc/ChangeLog:
>
>         * config/aarch64/aarch64.md (absdi2): Remove scratch operand by
>         earlyclobbering result operand.
>
>         * config/aarch64/aarch64-builtins.c (aarch64_types_unop_qualifiers):
>         Remove final qualifier_internal.
>         (aarch64_fold_builtin): Stop folding abs builtins, except on floats.

OK, and the test case too /Marcus
diff mbox

Patch

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index c0881e643440dd688570bcbc2a849ae2d441225a..7a444053fc1aeb4f7d2764b7fd50a39fe0e4f83b 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -126,11 +126,9 @@  typedef struct
   enum aarch64_type_qualifiers *qualifiers;
 } aarch64_simd_builtin_datum;
 
-/*  The qualifier_internal allows generation of a unary builtin from
-    a pattern with a third pseudo-operand such as a match_scratch.  */
 static enum aarch64_type_qualifiers
 aarch64_types_unop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
-  = { qualifier_none, qualifier_none, qualifier_internal };
+  = { qualifier_none, qualifier_none };
 #define TYPES_UNOP (aarch64_types_unop_qualifiers)
 static enum aarch64_type_qualifiers
 aarch64_types_unopu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
@@ -1275,7 +1273,7 @@  aarch64_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *args,
 
   switch (fcode)
     {
-      BUILTIN_VALLDI (UNOP, abs, 2)
+      BUILTIN_VDQF (UNOP, abs, 2)
 	return fold_build1 (ABS_EXPR, type, args[0]);
 	break;
       VAR1 (UNOP, floatv2si, 2, v2sf)
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 17570ba026b45df097f8838751eed86d7ce03609..25ad10405a88ba6e629bb2bf671041a639876ff2 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1957,9 +1957,8 @@ 
 )
 
 (define_insn_and_split "absdi2"
-  [(set (match_operand:DI 0 "register_operand" "=r,w")
-	(abs:DI (match_operand:DI 1 "register_operand" "r,w")))
-   (clobber (match_scratch:DI 2 "=&r,X"))]
+  [(set (match_operand:DI 0 "register_operand" "=&r,w")
+	(abs:DI (match_operand:DI 1 "register_operand" "r,w")))]
   ""
   "@
    #
@@ -1969,7 +1968,7 @@ 
    && GP_REGNUM_P (REGNO (operands[1]))"
   [(const_int 0)]
   {
-    emit_insn (gen_rtx_SET (VOIDmode, operands[2],
+    emit_insn (gen_rtx_SET (VOIDmode, operands[0],
 			    gen_rtx_XOR (DImode,
 					 gen_rtx_ASHIFTRT (DImode,
 							   operands[1],
@@ -1978,7 +1977,7 @@ 
     emit_insn (gen_rtx_SET (VOIDmode,
 			    operands[0],
 			    gen_rtx_MINUS (DImode,
-					   operands[2],
+					   operands[0],
 					   gen_rtx_ASHIFTRT (DImode,
 							     operands[1],
 							     GEN_INT (63)))));