Message ID | 58F9C40C.3080502@foss.arm.com |
---|---|
State | New |
Ping.

https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00933.html

Thanks,
Kyrill

On 21/04/17 09:34, Kyrill Tkachov wrote:
> Hi all,
>
> For the testcase in the patch we currently miss a combination and generate:
> foo:
>         dup     h1, v1.h[2]
>         ins     v0.h[3], v1.h[0]
>         ret
>
> bar:
>         dup     h1, v1.h[2]
>         ins     v0.h[3], v1.h[0]
>         ret
>
> This is because the *aarch64_simd_vec_copy_lane<mode> pattern is not defined
> for HF vector modes. I think that's just a simple oversight fixed by using
> the VALL_F16 mode iterator instead of VALL (it just adds V4HF and V8HF on top of VALL)
> and we can use the proper INS pattern and generate:
> foo:
>         ins     v0.h[3], v1.h[2]
>         ret
>
> bar:
>         ins     v0.h[3], v1.h[2]
>         ret
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for GCC 8?
>
> Thanks,
> Kyrill
>
> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/aarch64/aarch64-simd.md (*aarch64_simd_vec_copy_lane<mode>):
>     Use VALL_F16 iterator rather than VALL.
>
> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * gcc.target/aarch64/hfmode_ins_1.c: New test.
>
Ping.

Thanks,
Kyrill

On 11/05/17 11:15, Kyrill Tkachov wrote:
> Ping.
>
> https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00933.html
>
> Thanks,
> Kyrill
>
> On 21/04/17 09:34, Kyrill Tkachov wrote:
>> Hi all,
>>
>> For the testcase in the patch we currently miss a combination and generate:
>> foo:
>>         dup     h1, v1.h[2]
>>         ins     v0.h[3], v1.h[0]
>>         ret
>>
>> bar:
>>         dup     h1, v1.h[2]
>>         ins     v0.h[3], v1.h[0]
>>         ret
>>
>> This is because the *aarch64_simd_vec_copy_lane<mode> pattern is not defined
>> for HF vector modes. I think that's just a simple oversight fixed by using
>> the VALL_F16 mode iterator instead of VALL (it just adds V4HF and V8HF on top of VALL)
>> and we can use the proper INS pattern and generate:
>> foo:
>>         ins     v0.h[3], v1.h[2]
>>         ret
>>
>> bar:
>>         ins     v0.h[3], v1.h[2]
>>         ret
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu.
>> Ok for GCC 8?
>>
>> Thanks,
>> Kyrill
>>
>> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     * config/aarch64/aarch64-simd.md (*aarch64_simd_vec_copy_lane<mode>):
>>     Use VALL_F16 iterator rather than VALL.
>>
>> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     * gcc.target/aarch64/hfmode_ins_1.c: New test.
>>
>
On Fri, Apr 21, 2017 at 09:34:20AM +0100, Kyrill Tkachov wrote:
> Hi all,
>
> For the testcase in the patch we currently miss a combination and generate:
> foo:
>         dup     h1, v1.h[2]
>         ins     v0.h[3], v1.h[0]
>         ret
>
> bar:
>         dup     h1, v1.h[2]
>         ins     v0.h[3], v1.h[0]
>         ret
>
> This is because the *aarch64_simd_vec_copy_lane<mode> pattern is not defined
> for HF vector modes. I think that's just a simple oversight fixed by using
> the VALL_F16 mode iterator instead of VALL (it just adds V4HF and V8HF on top of VALL)
> and we can use the proper INS pattern and generate:
> foo:
>         ins     v0.h[3], v1.h[2]
>         ret
>
> bar:
>         ins     v0.h[3], v1.h[2]
>         ret
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for GCC 8?

Yes, this is OK.

Thanks,
James

>
> Thanks,
> Kyrill
>
> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/aarch64/aarch64-simd.md (*aarch64_simd_vec_copy_lane<mode>):
>     Use VALL_F16 iterator rather than VALL.
>
> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * gcc.target/aarch64/hfmode_ins_1.c: New test.
>
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 7ad3a76c8fa8bc28b8e0c6314958be7dfcf43457..3eeb54bdd512c729f43f3a19ebb0e58567767d20 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -565,14 +565,14 @@ (define_insn "aarch64_simd_vec_set<mode>"
 )
 
 (define_insn "*aarch64_simd_vec_copy_lane<mode>"
-  [(set (match_operand:VALL 0 "register_operand" "=w")
-        (vec_merge:VALL
-            (vec_duplicate:VALL
+  [(set (match_operand:VALL_F16 0 "register_operand" "=w")
+        (vec_merge:VALL_F16
+            (vec_duplicate:VALL_F16
               (vec_select:<VEL>
-                (match_operand:VALL 3 "register_operand" "w")
+                (match_operand:VALL_F16 3 "register_operand" "w")
                 (parallel
                   [(match_operand:SI 4 "immediate_operand" "i")])))
-            (match_operand:VALL 1 "register_operand" "0")
+            (match_operand:VALL_F16 1 "register_operand" "0")
             (match_operand:SI 2 "immediate_operand" "i")))]
   "TARGET_SIMD"
   {
diff --git a/gcc/testsuite/gcc.target/aarch64/hfmode_ins_1.c b/gcc/testsuite/gcc.target/aarch64/hfmode_ins_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..7fafe92f49042b64d24ad4d5219251645da3abfd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/hfmode_ins_1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Check that we can perform this in a single INS without doing any DUPs.  */
+
+#include <arm_neon.h>
+
+float16x8_t
+foo (float16x8_t a, float16x8_t b)
+{
+  return vsetq_lane_f16 (vgetq_lane_f16 (b, 2), a, 3);
+}
+
+float16x4_t
+bar (float16x4_t a, float16x4_t b)
+{
+  return vset_lane_f16 (vget_lane_f16 (b, 2), a, 3);
+}
+
+/* { dg-final { scan-assembler-times "ins\\t" 2 } } */
+/* { dg-final { scan-assembler-not "dup\\t" } } */
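
For readers unfamiliar with the RTL shape the pattern matches, here is a rough, portable C model of what (vec_merge (vec_duplicate (vec_select ...)) ...) computes. It is only a sketch and not part of the patch: the names model_copy_lane, model_ins and NLANES are hypothetical, lanes are modelled as raw 16-bit values so that no FP16 support is needed, and lane numbering is assumed to be little-endian (the big-endian lane remapping done by the real pattern is ignored).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NLANES 8 /* models a V8HF register: eight 16-bit lanes */

/* Hypothetical helper, not GCC code.  Models
     (vec_merge (vec_duplicate (vec_select src [src_lane])) dst mask)
   lane by lane.  vec_merge takes a lane from its first operand where the
   corresponding mask bit is set and from its second operand otherwise.  */
static void
model_copy_lane (uint16_t out[NLANES], const uint16_t dst[NLANES],
                 const uint16_t src[NLANES], unsigned mask, unsigned src_lane)
{
  assert (src_lane < NLANES);

  /* vec_select: pick a single element out of the source vector.  */
  uint16_t elem = src[src_lane];

  for (size_t i = 0; i < NLANES; i++)
    /* vec_duplicate broadcasts elem; the mask decides where it lands.  */
    out[i] = ((mask >> i) & 1u) ? elem : dst[i];
}

/* Hypothetical wrapper mirroring "ins vD.h[dst_lane], vS.h[src_lane]":
   a mask with exactly one bit set replaces exactly one lane.  */
static void
model_ins (uint16_t out[NLANES], const uint16_t dst[NLANES],
           const uint16_t src[NLANES], unsigned dst_lane, unsigned src_lane)
{
  assert (dst_lane < NLANES);
  model_copy_lane (out, dst, src, 1u << dst_lane, src_lane);
}
```

Under these assumptions, model_ins (r, a, b, 3, 2) corresponds to the testcase's foo: lane 3 of the result comes from lane 2 of b, and every other lane is copied from a unchanged.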