Message ID | 5475F813.9090001@arm.com |
---|---|
State | New |
Headers | show |
On Wed, Nov 26, 2014 at 03:56:03PM +0000, Alan Lawrence wrote: > So in case there's any confusion about the behaviour expected of *the vabs > intrinsic*, here's a testcase (failing without patch, passing with it)... > > --Alan > > Alan Lawrence wrote: > > ...as the former is defined as returning MIN_VALUE for argument MIN_VALUE, > > whereas the latter is 'undefined', and gcc can optimize "abs(x)>=0" to "true", > > which is wrong for __builtin_aarch64_abs. > > > > There has been much debate here, although not recently - I think the last was > > https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00387.html . However without a > > definite solution, we should at least stop the folding, as otherwise this is a bug. Well, I don't see myself getting round to looking at the solutions proposed in that thread any time soon, and as you say this is a longstanding bug in our intrinsics implementation. So I'm in favour of that part of your patch going in, though I have some comments: > @@ -1317,9 +1322,6 @@ aarch64_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *args, > > switch (fcode) > { > - BUILTIN_VALLDI (UNOP, abs, 2) > - return fold_build1 (ABS_EXPR, type, args[0]); > - break; > BUILTIN_VALLDI (BINOP, cmge, 0) > return fold_build2 (GE_EXPR, type, args[0], args[1]); > break; Why do we want to turn off folding for the V4SF/V2SF/V2DF modes of these intrinsics? There should be no difference between the mid-end definition and the intrinsic definition of their behaviour. I also note that the integer forms of these now end up as an "abs" RTL expression - can we guarantee that preserves the intrinsics behaviour and no RTL-folder will come along and mis-optimize? Documentation is vague on this point. I'm also not convinced by the SIMD_ARG_SCRATCH foo you add. Looking at the aarch64.md:absdi2 pattern I can't see why we need that scratch at all. It seems we could get away with marking operand 0 early-clobber and using it as our scratch register. Then we could drop all the extra infrastructure from this patch. Thanks, James > > > > The complication here is that the folding meant we never expanded the call to > > __builtin_aarch64_absdi, and thus never realized that expanding would cause an > > ICE: gen_absdi2() takes only two parameters (source and dest operand), and has > > no parameter corresponding to the scratch operand in the insn_data; but the code > > in aarch64_simd_expand_args, reads the number of arguments from the insn_data, > > and so tries to get another operand from the call to __builtin_aarch64_absdi, > > causing the ICE. Hence, we have to introduce a SIMD_ARG_SCRATCH, used in > > aarch64_simd_expand_args to skip over the argument. > > > > (There is an alternative way of doing this, i.e. to update the for-loop in > > aarch64_simd_expand_builtin by adding yet another version of 'k'. However, I > > thought there were enough versions already that I really didn't want to add one > > more...) > > > > To go with this, I've renamed qualifier_internal to qualifier_scratch, as it's > > not clear that any non-scratch internal operands (whatever such might be) would > > want the same treatment! > > > > Cross-tested check-gcc on aarch64-none-elf. > > > > Ok for trunk? > > > > gcc/ChangeLog: > > > > * config/aarch64/aarch64-builtins.c (enum aarch64_type_qualifiers): > > Rename qualifier_internal to qualifier_scratch. > > (aarch64_types_unop_qualifiers, aarch64_init_simd_builtins): Follow > > renaming. > > (builtin_simd_arg): New SIMD_ARG_SCRATCH enum value. > > (aarch64_simd_expand_args): Skip over SIMD_ARG_SCRATCHes. > > (aarch64_simd_expand_builtin): Handle qualifier_scratch. > > (aarch64_fold_builtin): Remove folding of abs. > diff --git a/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c b/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c > new file mode 100644 > index 0000000000000000000000000000000000000000..12fdd813bc3f58a183b4986c5c9c532c2b608699 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c > @@ -0,0 +1,17 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2" } */ > + > +#include <arm_neon.h> > + > +extern void abort (void); > + > +int > +main (int argc, char **argv) > +{ > + int8x8_t a = vabs_s8 (vdup_n_s8 (-128)); /* Should all be -128. */ > + uint8x8_t b = vcltz_s8 (a); /* Should all be true i.e. -1. */ > + if (vget_lane_u8 (b, 1)) > + return 0; > + abort (); > +} > +
diff --git a/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c b/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c new file mode 100644 index 0000000000000000000000000000000000000000..12fdd813bc3f58a183b4986c5c9c532c2b608699 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c @@ -0,0 +1,17 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include <arm_neon.h> + +extern void abort (void); + +int +main (int argc, char **argv) +{ + int8x8_t a = vabs_s8 (vdup_n_s8 (-128)); /* Should all be -128. */ + uint8x8_t b = vcltz_s8 (a); /* Should all be true i.e. -1. */ + if (vget_lane_u8 (b, 1)) + return 0; + abort (); +} +