Patchwork [AARCH64] Fix unrecognizable insn issue

login
register
mail settings
Submitter James Greenhalgh
Date April 10, 2013, 10:31 a.m.
Message ID <1365589912-3344-1-git-send-email-james.greenhalgh@arm.com>
Download mbox | patch
Permalink /patch/235350/
State New
Headers show

Comments

James Greenhalgh - April 10, 2013, 10:31 a.m.
> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
> owner@gcc.gnu.org] On Behalf Of Zhenqiang Chen
> Sent: 10 April 2013 09:02
> To: gcc-patches@gcc.gnu.org
> Cc: Marcus Shawcroft
> Subject: [PATCH, AARCH64] Fix unrecognizable insn issue
>
> Hi,
>
> During expand, function aarch64_vcond_internal inverses some CMP, e.g.
>
>   a LE b -> b GE a
>
> But if "b" is "CONST0_RTX", "b GE a" will be an illegal insn.

Yes it will. We should not be swapping the comparison in these cases.

>
> Refer https://bugs.launchpad.net/linaro-toolchain-binaries/+bug/1163942
> for detail about the issue.
>
> The patch is to make "b" a register when inversing LE.

This patch is too restrictive. There is an `fcmle v0.2d #0` form which we
should be generating when we can. Also, you are only fixing one problematic
case where there are a few.

I don't have access to your reproducer, so I can't be certain this patch
is correct - I have created my own reproducer and added it in with
the other vect-fcm tests.

Thorough regression tests are ongoing for this patch, but it
passes aarch64.exp and vect.exp with no regressions.

Thanks,
James

---
gcc/

2013-04-10  James Greenhalgh  <james.greenhalgh@arm.com>

	* config/aarch64/aarch64-simd.md (aarch64_vcond_internal): Fix
	floating-point vector comparisons against 0.

gcc/testsuite/

2013-04-10  James Greenhalgh  <james.greenhalgh@arm.com>

   	* gcc.target/aarch64/vect-fcm.x: Add check for zero forms of
	inverse operands.
	* gcc.target/aarch64/vect-fcm-eq-d.c: Check that new zero form
	loop is vectorized.
   	* gcc.target/aarch64/vect-fcm-eq-f.c: Likewise.
   	* gcc.target/aarch64/vect-fcm-ge-d.c: Check that new zero form
	loop is vectorized and that the correct instruction is generated.
   	* gcc.target/aarch64/vect-fcm-ge-f.c: Likewise.
   	* gcc.target/aarch64/vect-fcm-gt-d.c: Likewise.
   	* gcc.target/aarch64/vect-fcm-gt-f.c: Likewise.
Richard Earnshaw - April 10, 2013, 10:43 a.m.
On 10/04/13 11:31, James Greenhalgh wrote:
>
>> -----Original Message-----
>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>> owner@gcc.gnu.org] On Behalf Of Zhenqiang Chen
>> Sent: 10 April 2013 09:02
>> To: gcc-patches@gcc.gnu.org
>> Cc: Marcus Shawcroft
>> Subject: [PATCH, AARCH64] Fix unrecognizable insn issue
>>
>> Hi,
>>
>> During expand, function aarch64_vcond_internal inverses some CMP, e.g.
>>
>>    a LE b -> b GE a
>>
>> But if "b" is "CONST0_RTX", "b GE a" will be an illegal insn.
>
> Yes it will. We should not be swapping the comparison in these cases.
>
>>
>> Refer https://bugs.launchpad.net/linaro-toolchain-binaries/+bug/1163942
>> for detail about the issue.
>>
>> The patch is to make "b" a register when inversing LE.
>
> This patch is too restrictive. There is an `fcmle v0.2d #0` form which we
> should be generating when we can. Also, you are only fixing one problematic
> case where there are a few.
>
> I don't have access to your reproducer, so I can't be certain this patch
> is correct - I have created my own reproducer and added it in with
> the other vect-fcm tests.
>
> Thorough regression tests are ongoing for this patch, but it
> passes aarch64.exp and vect.exp with no regressions.
>
> Thanks,
> James
>
> ---
> gcc/
>
> 2013-04-10  James Greenhalgh  <james.greenhalgh@arm.com>
>
> 	* config/aarch64/aarch64-simd.md (aarch64_vcond_internal): Fix
> 	floating-point vector comparisons against 0.
>
> gcc/testsuite/
>
> 2013-04-10  James Greenhalgh  <james.greenhalgh@arm.com>
>
>     	* gcc.target/aarch64/vect-fcm.x: Add check for zero forms of
> 	inverse operands.
> 	* gcc.target/aarch64/vect-fcm-eq-d.c: Check that new zero form
> 	loop is vectorized.
>     	* gcc.target/aarch64/vect-fcm-eq-f.c: Likewise.
>     	* gcc.target/aarch64/vect-fcm-ge-d.c: Check that new zero form
> 	loop is vectorized and that the correct instruction is generated.
>     	* gcc.target/aarch64/vect-fcm-ge-f.c: Likewise.
>     	* gcc.target/aarch64/vect-fcm-gt-d.c: Likewise.
>     	* gcc.target/aarch64/vect-fcm-gt-f.c: Likewise.
>
>

OK.

R.
Marcus Shawcroft - April 10, 2013, 10:48 a.m.
Zhenqiang, Does Jame's patch fix your test case?

/Marcus

On 10 April 2013 11:43, Richard Earnshaw <rearnsha@arm.com> wrote:
> On 10/04/13 11:31, James Greenhalgh wrote:
>>
>>
>>> -----Original Message-----
>>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>>> owner@gcc.gnu.org] On Behalf Of Zhenqiang Chen
>>> Sent: 10 April 2013 09:02
>>> To: gcc-patches@gcc.gnu.org
>>> Cc: Marcus Shawcroft
>>> Subject: [PATCH, AARCH64] Fix unrecognizable insn issue
>>>
>>> Hi,
>>>
>>> During expand, function aarch64_vcond_internal inverses some CMP, e.g.
>>>
>>>    a LE b -> b GE a
>>>
>>> But if "b" is "CONST0_RTX", "b GE a" will be an illegal insn.
>>
>>
>> Yes it will. We should not be swapping the comparison in these cases.
>>
>>>
>>> Refer https://bugs.launchpad.net/linaro-toolchain-binaries/+bug/1163942
>>> for detail about the issue.
>>>
>>> The patch is to make "b" a register when inversing LE.
>>
>>
>> This patch is too restrictive. There is an `fcmle v0.2d #0` form which we
>> should be generating when we can. Also, you are only fixing one
>> problematic
>> case where there are a few.
>>
>> I don't have access to your reproducer, so I can't be certain this patch
>> is correct - I have created my own reproducer and added it in with
>> the other vect-fcm tests.
>>
>> Thorough regression tests are ongoing for this patch, but it
>> passes aarch64.exp and vect.exp with no regressions.
>>
>> Thanks,
>> James
>>
>> ---
>> gcc/
>>
>> 2013-04-10  James Greenhalgh  <james.greenhalgh@arm.com>
>>
>>         * config/aarch64/aarch64-simd.md (aarch64_vcond_internal): Fix
>>         floating-point vector comparisons against 0.
>>
>> gcc/testsuite/
>>
>> 2013-04-10  James Greenhalgh  <james.greenhalgh@arm.com>
>>
>>         * gcc.target/aarch64/vect-fcm.x: Add check for zero forms of
>>         inverse operands.
>>         * gcc.target/aarch64/vect-fcm-eq-d.c: Check that new zero form
>>         loop is vectorized.
>>         * gcc.target/aarch64/vect-fcm-eq-f.c: Likewise.
>>         * gcc.target/aarch64/vect-fcm-ge-d.c: Check that new zero form
>>         loop is vectorized and that the correct instruction is generated.
>>         * gcc.target/aarch64/vect-fcm-ge-f.c: Likewise.
>>         * gcc.target/aarch64/vect-fcm-gt-d.c: Likewise.
>>         * gcc.target/aarch64/vect-fcm-gt-f.c: Likewise.
>>
>>
>
> OK.
>
> R.
>
>
Zhenqiang Chen - April 11, 2013, 1:32 a.m.
On 10 April 2013 18:48, Marcus Shawcroft <marcus.shawcroft@gmail.com> wrote:
> Zhenqiang, Does Jame's patch fix your test case?

Thank you all. The patch fixes my test case.

-Zhenqiang

> On 10 April 2013 11:43, Richard Earnshaw <rearnsha@arm.com> wrote:
>> On 10/04/13 11:31, James Greenhalgh wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>>>> owner@gcc.gnu.org] On Behalf Of Zhenqiang Chen
>>>> Sent: 10 April 2013 09:02
>>>> To: gcc-patches@gcc.gnu.org
>>>> Cc: Marcus Shawcroft
>>>> Subject: [PATCH, AARCH64] Fix unrecognizable insn issue
>>>>
>>>> Hi,
>>>>
>>>> During expand, function aarch64_vcond_internal inverses some CMP, e.g.
>>>>
>>>>    a LE b -> b GE a
>>>>
>>>> But if "b" is "CONST0_RTX", "b GE a" will be an illegal insn.
>>>
>>>
>>> Yes it will. We should not be swapping the comparison in these cases.
>>>
>>>>
>>>> Refer https://bugs.launchpad.net/linaro-toolchain-binaries/+bug/1163942
>>>> for detail about the issue.
>>>>
>>>> The patch is to make "b" a register when inversing LE.
>>>
>>>
>>> This patch is too restrictive. There is an `fcmle v0.2d #0` form which we
>>> should be generating when we can. Also, you are only fixing one
>>> problematic
>>> case where there are a few.
>>>
>>> I don't have access to your reproducer, so I can't be certain this patch
>>> is correct - I have created my own reproducer and added it in with
>>> the other vect-fcm tests.
>>>
>>> Thorough regression tests are ongoing for this patch, but it
>>> passes aarch64.exp and vect.exp with no regressions.
>>>
>>> Thanks,
>>> James
>>>
>>> ---
>>> gcc/
>>>
>>> 2013-04-10  James Greenhalgh  <james.greenhalgh@arm.com>
>>>
>>>         * config/aarch64/aarch64-simd.md (aarch64_vcond_internal): Fix
>>>         floating-point vector comparisons against 0.
>>>
>>> gcc/testsuite/
>>>
>>> 2013-04-10  James Greenhalgh  <james.greenhalgh@arm.com>
>>>
>>>         * gcc.target/aarch64/vect-fcm.x: Add check for zero forms of
>>>         inverse operands.
>>>         * gcc.target/aarch64/vect-fcm-eq-d.c: Check that new zero form
>>>         loop is vectorized.
>>>         * gcc.target/aarch64/vect-fcm-eq-f.c: Likewise.
>>>         * gcc.target/aarch64/vect-fcm-ge-d.c: Check that new zero form
>>>         loop is vectorized and that the correct instruction is generated.
>>>         * gcc.target/aarch64/vect-fcm-ge-f.c: Likewise.
>>>         * gcc.target/aarch64/vect-fcm-gt-d.c: Likewise.
>>>         * gcc.target/aarch64/vect-fcm-gt-f.c: Likewise.
>>>
>>>
>>
>> OK.
>>
>> R.
>>
>>

Patch

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index ad3f4a4..9b42365 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1622,6 +1622,7 @@ 
   "TARGET_SIMD"
 {
   int inverse = 0;
+  int use_zero_form = 0;
   int swap_bsl_operands = 0;
   rtx mask = gen_reg_rtx (<V_cmp_result>mode);
   rtx tmp = gen_reg_rtx (<V_cmp_result>mode);
@@ -1632,12 +1633,16 @@ 
   switch (GET_CODE (operands[3]))
     {
     case GE:
+    case GT:
     case LE:
+    case LT:
     case EQ:
-      if (!REG_P (operands[5])
-	  && (operands[5] != CONST0_RTX (<MODE>mode)))
-	operands[5] = force_reg (<MODE>mode, operands[5]);
-      break;
+      if (operands[5] == CONST0_RTX (<MODE>mode))
+	{
+	  use_zero_form = 1;
+	  break;
+	}
+      /* Fall through.  */
     default:
       if (!REG_P (operands[5]))
 	operands[5] = force_reg (<MODE>mode, operands[5]);
@@ -1688,7 +1693,26 @@ 
 	 a GT b -> a GT b
 	 a LE b -> b GE a
 	 a LT b -> b GT a
-	 a EQ b -> a EQ b  */
+	 a EQ b -> a EQ b
+	 Note that there also exist direct comparison against 0 forms,
+	 so catch those as a special case.  */
+      if (use_zero_form)
+	{
+	  inverse = 0;
+	  switch (GET_CODE (operands[3]))
+	    {
+	    case LT:
+	      base_comparison = gen_aarch64_cmlt<mode>;
+	      break;
+	    case LE:
+	      base_comparison = gen_aarch64_cmle<mode>;
+	      break;
+	    default:
+	      /* Do nothing, other zero form cases already have the correct
+		 base_comparison.  */
+	      break;
+	    }
+	}
 
       if (!inverse)
 	emit_insn (base_comparison (mask, operands[4], operands[5]));
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c b/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c
index b6fb5ae..19ecd63 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c
@@ -7,7 +7,7 @@ 
 
 #include "vect-fcm.x"
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-assembler "fcmeq\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" } } */
 /* { dg-final { scan-assembler "fcmeq\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, 0" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c b/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c
index 283d34f..30be5ad 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c
@@ -7,7 +7,7 @@ 
 
 #include "vect-fcm.x"
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-assembler "fcmeq\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s" } } */
 /* { dg-final { scan-assembler "fcmeq\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, 0" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c b/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c
index 868e1f8..b922833 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c
@@ -7,8 +7,9 @@ 
 
 #include "vect-fcm.x"
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-assembler "fcmge\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" } } */
 /* { dg-final { scan-assembler "fcmge\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, 0" } } */
+/* { dg-final { scan-assembler "fcmlt\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, 0" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 /* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c b/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c
index e3258f3..04d3533 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c
@@ -7,8 +7,9 @@ 
 
 #include "vect-fcm.x"
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-assembler "fcmge\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s" } } */
 /* { dg-final { scan-assembler "fcmge\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, 0" } } */
+/* { dg-final { scan-assembler "fcmlt\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, 0" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 /* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c b/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c
index ed8b452..421a04a 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c
@@ -7,8 +7,9 @@ 
 
 #include "vect-fcm.x"
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-assembler "fcmgt\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" } } */
 /* { dg-final { scan-assembler "fcmgt\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, 0" } } */
+/* { dg-final { scan-assembler "fcmle\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, 0" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 /* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c b/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c
index e90a875..cdeab14 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c
@@ -7,8 +7,9 @@ 
 
 #include "vect-fcm.x"
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-assembler "fcmgt\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s" } } */
 /* { dg-final { scan-assembler "fcmgt\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, 0" } } */
+/* { dg-final { scan-assembler "fcmle\\tv\[0-9\]+\.\[24\]s, v\[0-9\]+\.\[24\]s, 0" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 /* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-fcm.x b/gcc/testsuite/gcc.target/aarch64/vect-fcm.x
index 7e51bef..803861b 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-fcm.x
+++ b/gcc/testsuite/gcc.target/aarch64/vect-fcm.x
@@ -40,6 +40,15 @@  foobar (FTYPE *in1, FTYPE *in2, FTYPE *output)
     output[i] = (in1[i] OP 0.0) ? 4.0 : 2.0;
 }
 
+void
+foobarbar (FTYPE *in1, FTYPE *in2, FTYPE *output)
+{
+  int i = 0;
+  /* Vectorizable.  */
+  for (i = 0; i < N; i++)
+    output[i] = (in1[i] INV_OP 0.0) ? 4.0 : 2.0;
+}
+
 int
 main (int argc, char **argv)
 {
@@ -51,6 +60,11 @@  main (int argc, char **argv)
   for (i = 0; i < N; i++)
     if (out1[i] != out2[i])
       abort ();
+  foobar (input1, input2, out1);
+  foobarbar (input1, input2, out2);
+  for (i = 0; i < N; i++)
+    if (out1[i] == out2[i])
+      abort ();
   return 0;
 }