diff mbox

LRA: Fix caller-save store/restore instruction for large mode

Message ID CA+yXCZD++vCJa2E6Ew5Ks3DT+iaJA35t1kQ08UKGxSef2FHn2A@mail.gmail.com
State New
Headers show

Commit Message

Kito Cheng Jan. 7, 2015, 8:03 a.m. UTC
Hi Jeff:

It's updated patch,bootstrapped and run regression tested on arm-eabi,
arm-none-linux-uclibcgnueabi, x86_64-unknown-linux-gnu and nds32le-elf
without introducing regression.

Thanks for your review :)

2015-01-07  Kito Cheng  <kito@0xlab.org>

        PR target/64348
        * lra-constraints.c (split_reg): Fix caller-save store/restore
instruction generation.

Comments

Bin.Cheng Jan. 7, 2015, 8:50 a.m. UTC | #1
On Wed, Jan 7, 2015 at 4:03 PM, Kito Cheng <kito.cheng@gmail.com> wrote:
> Hi Jeff:
>
> It's updated patch,bootstrapped and run regression tested on arm-eabi,
> arm-none-linux-uclibcgnueabi, x86_64-unknown-linux-gnu and nds32le-elf
> without introducing regression.
>
> Thanks for your review :)
>
> 2015-01-07  Kito Cheng  <kito@0xlab.org>
>
>         PR target/64348
>         * lra-constraints.c (split_reg): Fix caller-save store/restore
> instruction generation.

Thanks for fixing the issue.
The PR is against existing testcase failure
gcc.c-torture/execute/scal-to-vec1.c.  Unless we can create a new
case, there is no need to include same case twice I think?  Or we can
mention the PR number in the original test case?

Thanks,
bin
Kito Cheng Jan. 7, 2015, 12:28 p.m. UTC | #2
Hi Bin:

It's 2 more line than gcc.c-torture/execute/scal-to-vec1.c since it's
need specific compilation
flag and specific target to reproduce this issue,
and it's can't reproduce by normal testing flow with
arm-*-linux-gnueabi (due to lack -fPIC flag),
so I prefer duplicate this case into gcc.target/arm/ :)

/* { dg-do compile } */
/* { dg-options "-O3 -fPIC -marm -mcpu=cortex-a8" } */

On Wed, Jan 7, 2015 at 4:50 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Wed, Jan 7, 2015 at 4:03 PM, Kito Cheng <kito.cheng@gmail.com> wrote:
>> Hi Jeff:
>>
>> It's updated patch,bootstrapped and run regression tested on arm-eabi,
>> arm-none-linux-uclibcgnueabi, x86_64-unknown-linux-gnu and nds32le-elf
>> without introducing regression.
>>
>> Thanks for your review :)
>>
>> 2015-01-07  Kito Cheng  <kito@0xlab.org>
>>
>>         PR target/64348
>>         * lra-constraints.c (split_reg): Fix caller-save store/restore
>> instruction generation.
>
> Thanks for fixing the issue.
> The PR is against existing testcase failure
> gcc.c-torture/execute/scal-to-vec1.c.  Unless we can create a new
> case, there is no need to include same case twice I think?  Or we can
> mention the PR number in the original test case?
>
> Thanks,
> bin
Bin.Cheng Jan. 8, 2015, 1:29 a.m. UTC | #3
On Wed, Jan 7, 2015 at 8:28 PM, Kito Cheng <kito.cheng@gmail.com> wrote:
> Hi Bin:
>
> It's 2 more line than gcc.c-torture/execute/scal-to-vec1.c since it's
> need specific compilation
> flag and specific target to reproduce this issue,
> and it's can't reproduce by normal testing flow with
> arm-*-linux-gnueabi (due to lack -fPIC flag),
> so I prefer duplicate this case into gcc.target/arm/ :)
>
> /* { dg-do compile } */
> /* { dg-options "-O3 -fPIC -marm -mcpu=cortex-a8" } */
Not really, we generally want to avoid cpu related options in testcase
since it introduces conflict option failures when testing against
specific processor, e.g. testing against Cortex-M profile processors.

Thanks,
bin

>
> On Wed, Jan 7, 2015 at 4:50 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Wed, Jan 7, 2015 at 4:03 PM, Kito Cheng <kito.cheng@gmail.com> wrote:
>>> Hi Jeff:
>>>
>>> It's updated patch,bootstrapped and run regression tested on arm-eabi,
>>> arm-none-linux-uclibcgnueabi, x86_64-unknown-linux-gnu and nds32le-elf
>>> without introducing regression.
>>>
>>> Thanks for your review :)
>>>
>>> 2015-01-07  Kito Cheng  <kito@0xlab.org>
>>>
>>>         PR target/64348
>>>         * lra-constraints.c (split_reg): Fix caller-save store/restore
>>> instruction generation.
>>
>> Thanks for fixing the issue.
>> The PR is against existing testcase failure
>> gcc.c-torture/execute/scal-to-vec1.c.  Unless we can create a new
>> case, there is no need to include same case twice I think?  Or we can
>> mention the PR number in the original test case?
>>
>> Thanks,
>> bin
diff mbox

Patch

From b6a697ee5697b84082f6e619f4a435e6c750d695 Mon Sep 17 00:00:00 2001
From: Kito Cheng <kito@andestech.com>
Date: Thu, 29 May 2014 23:53:23 +0800
Subject: [PATCH] Fix caller-save store/restore instruction for large mode in
 lra

 - The original code assume store/restore will always have only
   one insn, it's will fail if you split in move pattern!!!
---
 gcc/lra-constraints.c                  | 14 +++---
 gcc/testsuite/gcc.target/arm/pr64348.c | 89 ++++++++++++++++++++++++++++++++++
 2 files changed, 95 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/pr64348.c

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 96b10a1..f8bc12f 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -4918,15 +4918,14 @@  split_reg (bool before_p, int original_regno, rtx_insn *insn,
       reg_renumber[REGNO (new_reg)] = hard_regno;
     }
   save = emit_spill_move (true, new_reg, original_reg);
-  if (NEXT_INSN (save) != NULL_RTX)
+  if (NEXT_INSN (save) != NULL_RTX && !call_save_p)
     {
-      lra_assert (! call_save_p);
       if (lra_dump_file != NULL)
 	{
 	  fprintf
 	    (lra_dump_file,
-	     "	  Rejecting split %d->%d resulting in > 2 %s save insns:\n",
-	     original_regno, REGNO (new_reg), call_save_p ? "call" : "");
+	     "	  Rejecting split %d->%d resulting in > 2 save insns:\n",
+	     original_regno, REGNO (new_reg));
 	  dump_rtl_slim (lra_dump_file, save, NULL, -1, 0);
 	  fprintf (lra_dump_file,
 		   "	))))))))))))))))))))))))))))))))))))))))))))))))\n");
@@ -4934,15 +4933,14 @@  split_reg (bool before_p, int original_regno, rtx_insn *insn,
       return false;
     }
   restore = emit_spill_move (false, new_reg, original_reg);
-  if (NEXT_INSN (restore) != NULL_RTX)
+  if (NEXT_INSN (restore) != NULL_RTX && !call_save_p)
     {
-      lra_assert (! call_save_p);
       if (lra_dump_file != NULL)
 	{
 	  fprintf (lra_dump_file,
 		   "	Rejecting split %d->%d "
-		   "resulting in > 2 %s restore insns:\n",
-		   original_regno, REGNO (new_reg), call_save_p ? "call" : "");
+		   "resulting in > 2 restore insns:\n",
+		   original_regno, REGNO (new_reg));
 	  dump_rtl_slim (lra_dump_file, restore, NULL, -1, 0);
 	  fprintf (lra_dump_file,
 		   "	))))))))))))))))))))))))))))))))))))))))))))))))\n");
diff --git a/gcc/testsuite/gcc.target/arm/pr64348.c b/gcc/testsuite/gcc.target/arm/pr64348.c
new file mode 100644
index 0000000..b949f35
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr64348.c
@@ -0,0 +1,89 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 -fPIC -marm -mcpu=cortex-a8" } */
+
+#define vector(elcount, type)  \
+__attribute__((vector_size((elcount)*sizeof(type)))) type
+
+#define vidx(type, vec, idx) (*((type *) &(vec) + idx))
+
+#define operl(a, b, op) (a op b)
+#define operr(a, b, op) (b op a)
+
+#define check(type, count, vec0, vec1, num, op, lr) \
+do {\
+    int __i; \
+    for (__i = 0; __i < count; __i++) {\
+        if (vidx (type, vec1, __i) != oper##lr (num, vidx (type, vec0, __i), op)) \
+            __builtin_abort (); \
+    }\
+} while (0)
+
+#define veccompare(type, count, v0, v1) \
+do {\
+    int __i; \
+    for (__i = 0; __i < count; __i++) { \
+        if (vidx (type, v0, __i) != vidx (type, v1, __i)) \
+            __builtin_abort (); \
+    } \
+} while (0)
+
+volatile int one = 1;
+
+int main (int argc, char *argv[]) {
+#define fvec_2 (vector(4, float)){2., 2., 2., 2.}
+#define dvec_2 (vector(2, double)){2., 2.}
+
+
+    vector(8, short) v0 = {one, 1, 2, 3, 4, 5, 6, 7};
+    vector(8, short) v1;
+
+    vector(4, float) f0 = {1., 2., 3., 4.};
+    vector(4, float) f1, f2;
+
+    vector(2, double) d0 = {1., 2.};
+    vector(2, double) d1, d2;
+
+
+
+    v1 = 2 + v0;   check (short, 8, v0, v1, 2, +, l);
+    v1 = 2 - v0;   check (short, 8, v0, v1, 2, -, l);
+    v1 = 2 * v0;   check (short, 8, v0, v1, 2, *, l);
+    v1 = 2 / v0;   check (short, 8, v0, v1, 2, /, l);
+    v1 = 2 % v0;   check (short, 8, v0, v1, 2, %, l);
+    v1 = 2 ^ v0;   check (short, 8, v0, v1, 2, ^, l);
+    v1 = 2 & v0;   check (short, 8, v0, v1, 2, &, l);
+    v1 = 2 | v0;   check (short, 8, v0, v1, 2, |, l);
+    v1 = 2 << v0;   check (short, 8, v0, v1, 2, <<, l);
+    v1 = 2 >> v0;   check (short, 8, v0, v1, 2, >>, l);
+
+    v1 = v0 + 2;   check (short, 8, v0, v1, 2, +, r);
+    v1 = v0 - 2;   check (short, 8, v0, v1, 2, -, r);
+    v1 = v0 * 2;   check (short, 8, v0, v1, 2, *, r);
+    v1 = v0 / 2;   check (short, 8, v0, v1, 2, /, r);
+    v1 = v0 % 2;   check (short, 8, v0, v1, 2, %, r);
+    v1 = v0 ^ 2;   check (short, 8, v0, v1, 2, ^, r);
+    v1 = v0 & 2;   check (short, 8, v0, v1, 2, &, r);
+    v1 = v0 | 2;   check (short, 8, v0, v1, 2, |, r);
+
+    f1 = 2. + f0;  f2 = fvec_2 + f0; veccompare (float, 4, f1, f2);
+    f1 = 2. - f0;  f2 = fvec_2 - f0; veccompare (float, 4, f1, f2);
+    f1 = 2. * f0;  f2 = fvec_2 * f0; veccompare (float, 4, f1, f2);
+    f1 = 2. / f0;  f2 = fvec_2 / f0; veccompare (float, 4, f1, f2);
+
+    f1 = f0 + 2.;  f2 = f0 + fvec_2; veccompare (float, 4, f1, f2);
+    f1 = f0 - 2.;  f2 = f0 - fvec_2; veccompare (float, 4, f1, f2);
+    f1 = f0 * 2.;  f2 = f0 * fvec_2; veccompare (float, 4, f1, f2);
+    f1 = f0 / 2.;  f2 = f0 / fvec_2; veccompare (float, 4, f1, f2);
+
+    d1 = 2. + d0;  d2 = dvec_2 + d0; veccompare (double, 2, d1, d2);
+    d1 = 2. - d0;  d2 = dvec_2 - d0; veccompare (double, 2, d1, d2);
+    d1 = 2. * d0;  d2 = dvec_2 * d0; veccompare (double, 2, d1, d2);
+    d1 = 2. / d0;  d2 = dvec_2 / d0; veccompare (double, 2, d1, d2);
+
+    d1 = d0 + 2.;  d2 = d0 + dvec_2; veccompare (double, 2, d1, d2);
+    d1 = d0 - 2.;  d2 = d0 - dvec_2; veccompare (double, 2, d1, d2);
+    d1 = d0 * 2.;  d2 = d0 * dvec_2; veccompare (double, 2, d1, d2);
+    d1 = d0 / 2.;  d2 = d0 / dvec_2; veccompare (double, 2, d1, d2);
+
+    return 0;
+}
-- 
2.1.0