diff mbox

Ping with testcase: [PATCH][AArch64] Fix __builtin_aarch64_absdi, must not fold to ABS_EXPR

Message ID 5475F813.9090001@arm.com
State New
Headers show

Commit Message

Alan Lawrence Nov. 26, 2014, 3:56 p.m. UTC
So in case there's any confusion about the behaviour expected of *the vabs 
intrinsic*, here's a testcase (failing without patch, passing with it)...

--Alan

Alan Lawrence wrote:
> ...as the former is defined as returning MIN_VALUE for argument MIN_VALUE, 
> whereas the latter is 'undefined', and gcc can optimize "abs(x)>=0" to "true", 
> which is wrong for __builtin_aarch64_abs.
> 
> There has been much debate here, although not recently - I think the last was 
> https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00387.html . However without a 
> definite solution, we should at least stop the folding, as otherwise this is a bug.
> 
> The complication here is that the folding meant we never expanded the call to 
> __builtin_aarch64_absdi, and thus never realized that expanding would cause an 
> ICE: gen_absdi2() takes only two parameters (source and dest operand), and has 
> no parameter corresponding to the scratch operand in the insn_data; but the code 
> in aarch64_simd_expand_args, reads the number of arguments from the insn_data, 
> and so tries to get another operand from the call to __builtin_aarch64_absdi, 
> causing the ICE. Hence, we have to introduce a SIMD_ARG_SCRATCH, used in 
> aarch64_simd_expand_args to skip over the argument.
> 
> (There is an alternative way of doing this, i.e. to update the for-loop in 
> aarch64_simd_expand_builtin by adding yet another version of 'k'. However, I 
> thought there were enough versions already that I really didn't want to add one 
> more...)
> 
> To go with this, I've renamed qualifier_internal to qualifier_scratch, as it's 
> not clear that any non-scratch internal operands (whatever such might be) would 
> want the same treatment!
> 
> Cross-tested check-gcc on aarch64-none-elf.
> 
> Ok for trunk?
> 
> gcc/ChangeLog:
> 
> 	* config/aarch64/aarch64-builtins.c (enum aarch64_type_qualifiers):
> 	Rename qualifier_internal to qualifier_scratch.
> 	(aarch64_types_unop_qualifiers, aarch64_init_simd_builtins): Follow
> 	renaming.
> 	(builtin_simd_arg): New SIMD_ARG_SCRATCH enum value.
> 	(aarch64_simd_expand_args): Skip over SIMD_ARG_SCRATCHes.
> 	(aarch64_simd_expand_builtin): Handle qualifier_scratch.
> 	(aarch64_fold_builtin): Remove folding of abs.

Comments

James Greenhalgh Nov. 26, 2014, 4:35 p.m. UTC | #1
On Wed, Nov 26, 2014 at 03:56:03PM +0000, Alan Lawrence wrote:
> So in case there's any confusion about the behaviour expected of *the vabs 
> intrinsic*, here's a testcase (failing without patch, passing with it)...
> 
> --Alan
> 
> Alan Lawrence wrote:
> > ...as the former is defined as returning MIN_VALUE for argument MIN_VALUE, 
> > whereas the latter is 'undefined', and gcc can optimize "abs(x)>=0" to "true", 
> > which is wrong for __builtin_aarch64_abs.
> > 
> > There has been much debate here, although not recently - I think the last was 
> > https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00387.html . However without a 
> > definite solution, we should at least stop the folding, as otherwise this is a bug.


Well, I don't see myself getting round to looking at the solutions
proposed in that thread any time soon, and as you say this is a longstanding
bug in our intrinsics implementation. So I'm in favour of that part of
your patch going in, though I have some comments:

> @@ -1317,9 +1322,6 @@ aarch64_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *args,
>  
>    switch (fcode)
>      {
> -      BUILTIN_VALLDI (UNOP, abs, 2)
> -	return fold_build1 (ABS_EXPR, type, args[0]);
> -	break;
>        BUILTIN_VALLDI (BINOP, cmge, 0)
>  	return fold_build2 (GE_EXPR, type, args[0], args[1]);
>  	break;

Why do we want to turn off folding for the V4SF/V2SF/V2DF modes of these
intrinsics? There should be no difference between the mid-end definition
and the intrinsic definition of their behaviour.

I also note that the integer forms of these now end up as an "abs" RTL
expression - can we guarantee that preserves the intrinsics behaviour and
no RTL-folder will come along and mis-optimize? Documentation is vague
on this point.

I'm also not convinced by the SIMD_ARG_SCRATCH foo you add. Looking at
the aarch64.md:absdi2 pattern I can't see why we need that scratch at
all. It seems we could get away with marking operand 0 early-clobber and
using it as our scratch register. Then we could drop all the extra
infrastructure from this patch.

Thanks,
James


> > 
> > The complication here is that the folding meant we never expanded the call to 
> > __builtin_aarch64_absdi, and thus never realized that expanding would cause an 
> > ICE: gen_absdi2() takes only two parameters (source and dest operand), and has 
> > no parameter corresponding to the scratch operand in the insn_data; but the code 
> > in aarch64_simd_expand_args, reads the number of arguments from the insn_data, 
> > and so tries to get another operand from the call to __builtin_aarch64_absdi, 
> > causing the ICE. Hence, we have to introduce a SIMD_ARG_SCRATCH, used in 
> > aarch64_simd_expand_args to skip over the argument.
> > 
> > (There is an alternative way of doing this, i.e. to update the for-loop in 
> > aarch64_simd_expand_builtin by adding yet another version of 'k'. However, I 
> > thought there were enough versions already that I really didn't want to add one 
> > more...)
> > 
> > To go with this, I've renamed qualifier_internal to qualifier_scratch, as it's 
> > not clear that any non-scratch internal operands (whatever such might be) would 
> > want the same treatment!
> > 
> > Cross-tested check-gcc on aarch64-none-elf.
> > 
> > Ok for trunk?
> > 
> > gcc/ChangeLog:
> > 
> > 	* config/aarch64/aarch64-builtins.c (enum aarch64_type_qualifiers):
> > 	Rename qualifier_internal to qualifier_scratch.
> > 	(aarch64_types_unop_qualifiers, aarch64_init_simd_builtins): Follow
> > 	renaming.
> > 	(builtin_simd_arg): New SIMD_ARG_SCRATCH enum value.
> > 	(aarch64_simd_expand_args): Skip over SIMD_ARG_SCRATCHes.
> > 	(aarch64_simd_expand_builtin): Handle qualifier_scratch.
> > 	(aarch64_fold_builtin): Remove folding of abs.

> diff --git a/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c b/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..12fdd813bc3f58a183b4986c5c9c532c2b608699
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c
> @@ -0,0 +1,17 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +#include <arm_neon.h>
> +
> +extern void abort (void);
> +
> +int
> +main (int argc, char **argv)
> +{
> +  int8x8_t a = vabs_s8 (vdup_n_s8 (-128)); /* Should all be -128.  */
> +  uint8x8_t b = vcltz_s8 (a); /* Should all be true i.e. -1. */
> +  if (vget_lane_u8 (b, 1))
> +    return 0;
> +  abort ();
> +}
> +
diff mbox

Patch

diff --git a/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c b/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..12fdd813bc3f58a183b4986c5c9c532c2b608699
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c
@@ -0,0 +1,17 @@ 
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include <arm_neon.h>
+
+extern void abort (void);
+
+int
+main (int argc, char **argv)
+{
+  int8x8_t a = vabs_s8 (vdup_n_s8 (-128)); /* Should all be -128.  */
+  uint8x8_t b = vcltz_s8 (a); /* Should all be true i.e. -1. */
+  if (vget_lane_u8 (b, 1))
+    return 0;
+  abort ();
+}
+