Message ID: 1365589912-3344-1-git-send-email-james.greenhalgh@arm.com
State: New
On 10/04/13 11:31, James Greenhalgh wrote:
>
>> -----Original Message-----
>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-owner@gcc.gnu.org]
>> On Behalf Of Zhenqiang Chen
>> Sent: 10 April 2013 09:02
>> To: gcc-patches@gcc.gnu.org
>> Cc: Marcus Shawcroft
>> Subject: [PATCH, AARCH64] Fix unrecognizable insn issue
>>
>> Hi,
>>
>> During expand, function aarch64_vcond_internal inverses some CMP, e.g.
>>
>>   a LE b -> b GE a
>>
>> But if "b" is "CONST0_RTX", "b GE a" will be an illegal insn.
>
> Yes it will. We should not be swapping the comparison in these cases.
>
>> Refer https://bugs.launchpad.net/linaro-toolchain-binaries/+bug/1163942
>> for detail about the issue.
>>
>> The patch is to make "b" a register when inversing LE.
>
> This patch is too restrictive. There is an `fcmle v0.2d #0` form which we
> should be generating when we can. Also, you are only fixing one problematic
> case where there are a few.
>
> I don't have access to your reproducer, so I can't be certain this patch
> is correct - I have created my own reproducer and added it in with
> the other vect-fcm tests.
>
> Thorough regression tests are ongoing for this patch, but it
> passes aarch64.exp and vect.exp with no regressions.
>
> Thanks,
> James
>
> ---
> gcc/
>
> 2013-04-10  James Greenhalgh  <james.greenhalgh@arm.com>
>
>	* config/aarch64/aarch64-simd.md (aarch64_vcond_internal): Fix
>	floating-point vector comparisons against 0.
>
> gcc/testsuite/
>
> 2013-04-10  James Greenhalgh  <james.greenhalgh@arm.com>
>
>	* gcc.target/aarch64/vect-fcm.x: Add check for zero forms of
>	inverse operands.
>	* gcc.target/aarch64/vect-fcm-eq-d.c: Check that new zero form
>	loop is vectorized.
>	* gcc.target/aarch64/vect-fcm-eq-f.c: Likewise.
>	* gcc.target/aarch64/vect-fcm-ge-d.c: Check that new zero form
>	loop is vectorized and that the correct instruction is generated.
>	* gcc.target/aarch64/vect-fcm-ge-f.c: Likewise.
>	* gcc.target/aarch64/vect-fcm-gt-d.c: Likewise.
>	* gcc.target/aarch64/vect-fcm-gt-f.c: Likewise.

OK.

R.
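The failing shape described above is easy to express in plain C. The sketch below is illustrative only (the function name is not from the test suite); it shows the kind of vectorizable loop whose `<=` comparison against literal zero the expander used to swap into an unrecognizable `0.0 >= a` insn:

```c
#include <assert.h>

/* A loop of the shape that exposed the bug: a floating-point
   comparison against literal 0.0 using one of the "inverse"
   operators (LE here).  Before the fix, the expander rewrote
   "in[i] <= 0.0" as "0.0 >= in[i]", producing an insn with a
   constant first operand that no pattern recognizes.  */
static void
select_le_zero (const double *in, double *out, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = (in[i] <= 0.0) ? 4.0 : 2.0;
}
```

Per the review above, a loop like this should now be vectorized using the direct compare-against-`#0` instruction form rather than a swapped operand order.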
Zhenqiang, does James's patch fix your test case?

/Marcus
On 10 April 2013 18:48, Marcus Shawcroft <marcus.shawcroft@gmail.com> wrote:
> Zhenqiang, does James's patch fix your test case?

Thank you all. The patch fixes my test case.

-Zhenqiang
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index ad3f4a4..9b42365 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1622,6 +1622,7 @@
   "TARGET_SIMD"
 {
   int inverse = 0;
+  int use_zero_form = 0;
   int swap_bsl_operands = 0;
   rtx mask = gen_reg_rtx (<V_cmp_result>mode);
   rtx tmp = gen_reg_rtx (<V_cmp_result>mode);
@@ -1632,12 +1633,16 @@
   switch (GET_CODE (operands[3]))
     {
     case GE:
+    case GT:
     case LE:
+    case LT:
     case EQ:
-      if (!REG_P (operands[5])
-          && (operands[5] != CONST0_RTX (<MODE>mode)))
-        operands[5] = force_reg (<MODE>mode, operands[5]);
-      break;
+      if (operands[5] == CONST0_RTX (<MODE>mode))
+        {
+          use_zero_form = 1;
+          break;
+        }
+      /* Fall through.  */
     default:
       if (!REG_P (operands[5]))
         operands[5] = force_reg (<MODE>mode, operands[5]);
@@ -1688,7 +1693,26 @@
          a GT b -> a GT b
          a LE b -> b GE a
          a LT b -> b GT a
-         a EQ b -> a EQ b  */
+         a EQ b -> a EQ b
+         Note that there also exist direct comparison against 0 forms,
+         so catch those as a special case.  */
+      if (use_zero_form)
+        {
+          inverse = 0;
+          switch (GET_CODE (operands[3]))
+            {
+            case LT:
+              base_comparison = gen_aarch64_cmlt<mode>;
+              break;
+            case LE:
+              base_comparison = gen_aarch64_cmle<mode>;
+              break;
+            default:
+              /* Do nothing, other zero form cases already have the correct
+                 base_comparison.  */
+              break;
+            }
+        }

   if (!inverse)
     emit_insn (base_comparison (mask, operands[4], operands[5]));
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c b/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c
index b6fb5ae..19ecd63 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c
@@ -7,7 +7,7 @@

 #include "vect-fcm.x"

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-assembler "fcmeq\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" } } */
 /* { dg-final { scan-assembler "fcmeq\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, 0" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c b/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c
index 283d34f..30be5ad 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c
@@ -7,7 +7,7 @@

 #include "vect-fcm.x"

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-assembler "fcmeq\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s" } } */
 /* { dg-final { scan-assembler "fcmeq\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, 0" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c b/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c
index 868e1f8..b922833 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c
@@ -7,8 +7,9 @@

 #include "vect-fcm.x"

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-assembler "fcmge\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" } } */
 /* { dg-final { scan-assembler "fcmge\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, 0" } } */
+/* { dg-final { scan-assembler "fcmlt\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, 0" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 /* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c b/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c
index e3258f3..04d3533 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c
@@ -7,8 +7,9 @@

 #include "vect-fcm.x"

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-assembler "fcmge\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s" } } */
 /* { dg-final { scan-assembler "fcmge\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, 0" } } */
+/* { dg-final { scan-assembler "fcmlt\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, 0" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 /* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c b/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c
index ed8b452..421a04a 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c
@@ -7,8 +7,9 @@

 #include "vect-fcm.x"

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-assembler "fcmgt\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" } } */
 /* { dg-final { scan-assembler "fcmgt\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, 0" } } */
+/* { dg-final { scan-assembler "fcmle\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, 0" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 /* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c b/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c
index e90a875..cdeab14 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c
@@ -7,8 +7,9 @@

 #include "vect-fcm.x"

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-assembler "fcmgt\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s" } } */
 /* { dg-final { scan-assembler "fcmgt\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, 0" } } */
+/* { dg-final { scan-assembler "fcmle\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, 0" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 /* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm.x b/gcc/testsuite/gcc.target/aarch64/vect-fcm.x
index 7e51bef..803861b 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm.x
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm.x
@@ -40,6 +40,15 @@ foobar (FTYPE *in1, FTYPE *in2, FTYPE *output)
     output[i] = (in1[i] OP 0.0) ? 4.0 : 2.0;
 }

+void
+foobarbar (FTYPE *in1, FTYPE *in2, FTYPE *output)
+{
+  int i = 0;
+  /* Vectorizable.  */
+  for (i = 0; i < N; i++)
+    output[i] = (in1[i] INV_OP 0.0) ? 4.0 : 2.0;
+}
+
 int
 main (int argc, char **argv)
 {
@@ -51,6 +60,11 @@ main (int argc, char **argv)
   for (i = 0; i < N; i++)
     if (out1[i] != out2[i])
       abort ();
+  foobar (input1, input2, out1);
+  foobarbar (input1, input2, out2);
+  for (i = 0; i < N; i++)
+    if (out1[i] == out2[i])
+      abort ();
   return 0;
 }
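The zero-form handling the patch adds can be summarized as a lookup: when the second comparison operand is literal zero, every code has a direct compare-against-`#0` instruction, so LE and LT select fcmle/fcmlt directly instead of being swapped into GE/GT with zero on the left. The sketch below is illustrative only; the enum and function are not GCC code:

```c
#include <string.h>

/* Comparison codes standing in for the rtx codes the expander
   switches on; this enum is illustrative, not GCC's.  */
enum cmp_code { CMP_EQ, CMP_GE, CMP_GT, CMP_LE, CMP_LT };

/* Sketch of the zero-form selection added by the patch: each code
   maps to a direct "compare against #0" AArch64 instruction, the
   forms the new testsuite scan-assembler checks look for.  */
static const char *
zero_form_insn (enum cmp_code code)
{
  switch (code)
    {
    case CMP_EQ: return "fcmeq";
    case CMP_GE: return "fcmge";
    case CMP_GT: return "fcmgt";
    case CMP_LE: return "fcmle";  /* direct form; no operand swap */
    case CMP_LT: return "fcmlt";  /* direct form; no operand swap */
    }
  return "";
}
```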