diff mbox

[auto-inc-dec.c] Account for cost of move operation in FORM_PRE_ADD and FORM_POST_ADD cases

Message ID 562E13EC.3010305@arm.com
State New
Headers show

Commit Message

Kyrylo Tkachov Oct. 26, 2015, 11:52 a.m. UTC
On 26/10/15 11:28, Bernd Schmidt wrote:
> On 10/26/2015 12:12 PM, Bernd Schmidt wrote:
>>
>> But isn't that balanced by the fact that it doesn't seem to take into
>> account the gain of removing the inc_insn either? So I think this can't
>> be right.
>
> Argh, misread the code. The patch is OK with the change I suggested.
>

Thanks!
Here's what I committed with r229344.

Kyrill

2015-10-26  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * auto-inc-dec.c (insert_move_insn_before): Delete.
     (attempt_change): Remember to cost the simple move in the
     FORM_PRE_ADD and FORM_POST_ADD cases.

>
> Bernd
>

Comments

Christophe Lyon Oct. 28, 2015, 5:23 p.m. UTC | #1
Hi Kyrill,

On 26 October 2015 at 12:52, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>
> On 26/10/15 11:28, Bernd Schmidt wrote:
>>
>> On 10/26/2015 12:12 PM, Bernd Schmidt wrote:
>>>
>>>
>>> But isn't that balanced by the fact that it doesn't seem to take into
>>> account the gain of removing the inc_insn either? So I think this can't
>>> be right.
>>
>>
>> Argh, misread the code. The patch is OK with the change I suggested.
>>
>
> Thanks!
> Here's what I committed with r229344.
>
Since this commit, I've noticed:

FAIL: gcc.target/arm/lp1243022.c scan-rtl-dump subreg2 "REG_INC"
with --target arm-none-eabi --with-thumb --with-cpu=cortex-a9

as well as ICEs:
  gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2  (internal compiler error)
  gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  (internal compiler error)

with --target arm-none-linux-gnueabihf --with-thumb
--with-cpu=cortex-a15 --with-fpu=neon-vfpv4
and --target arm-none-linux-gnueabihf --with-thumb
--with-cpu=cortex-a57 --with-fpu=crypto-neon-fp-armv8

See for a more synthetic view:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/229344/report-build-info.html

Can you have a look?

Thanks,

Christophe.

> Kyrill
>
> 2015-10-26  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * auto-inc-dec.c (insert_move_insn_before): Delete.
>     (attempt_change): Remember to cost the simple move in the
>     FORM_PRE_ADD and FORM_POST_ADD cases.
>
>>
>> Bernd
>>
>
Kyrylo Tkachov Oct. 28, 2015, 5:45 p.m. UTC | #2
On 28/10/15 17:23, Christophe Lyon wrote:
> Hi Kyrill,

Hi Christophe,

>
> On 26 October 2015 at 12:52, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>> On 26/10/15 11:28, Bernd Schmidt wrote:
>>> On 10/26/2015 12:12 PM, Bernd Schmidt wrote:
>>>>
>>>> But isn't that balanced by the fact that it doesn't seem to take into
>>>> account the gain of removing the inc_insn either? So I think this can't
>>>> be right.
>>>
>>> Argh, misread the code. The patch is OK with the change I suggested.
>>>
>> Thanks!
>> Here's what I committed with r229344.
>>
> Since this commit, I've noticed:
>
> FAIL: gcc.target/arm/lp1243022.c scan-rtl-dump subreg2 "REG_INC"
> with --target arm-none-eabi --with-thumb --with-cpu=cortex-a9

I think this is a testcase issue, I'll look into updating it separately.

> as well as ICEs:
>    gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2  (internal compiler error)
>    gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2 -flto
> -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
>
> with --target arm-none-linux-gnueabihf --with-thumb
> --with-cpu=cortex-a15 --with-fpu=neon-vfpv4
> and --target arm-none-linux-gnueabihf --with-thumb
> --with-cpu=cortex-a57 --with-fpu=crypto-neon-fp-armv8
>
> See for a more synthetic view:
> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/229344/report-build-info.html
>
> Can you have a look?

The ICE backtrace is:
0x7a4494 change_address_1
         $SRC/gcc/emit-rtl.c:2132
0x7a7453 adjust_address_1(rtx_def*, machine_mode, long, int, int, int, long)
         $SRC/gcc/emit-rtl.c:2270
0xab792d gen_lowpart_general(machine_mode, rtx_def*)
         $SRC/gcc/rtlhooks.c:90
0xfd721d gen_split_47(rtx_insn*, rtx_def**)
         $SRC/gcc/config/arm/arm.md:4336
0x12473f2 split_11
         $SRC/gcc/config/arm/arm.md:4331
0x12473f2 split_insns(rtx_def*, rtx_insn*)
         $SRC/gcc/config/arm/sync.md:361
0x7af30e try_split(rtx_def*, rtx_insn*, int)
         $SRC/gcc/emit-rtl.c:3664
0xa53dc5 split_insn
         $SRC/gcc/recog.c:2874
0xa5ccf5 split_all_insns()
         $SRC/gcc/recog.c:2964
0xa5cde3 rest_of_handle_split_after_reload
         $SRC/gcc/recog.c:3900
0xa5cde3 execute
         $SRC/gcc/recog.c:3929


Looks like a latent issue exposed by my patch.
Could you please file a BZ entry if get the chance?

Thanks,
Kyrill


> Thanks,
>
> Christophe.
>
>> Kyrill
>>
>> 2015-10-26  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>      * auto-inc-dec.c (insert_move_insn_before): Delete.
>>      (attempt_change): Remember to cost the simple move in the
>>      FORM_PRE_ADD and FORM_POST_ADD cases.
>>
>>> Bernd
>>>
Kyrylo Tkachov Oct. 29, 2015, 3:53 p.m. UTC | #3
On 28/10/15 17:45, Kyrill Tkachov wrote:
>
> On 28/10/15 17:23, Christophe Lyon wrote:
>> Hi Kyrill,
>
> Hi Christophe,
>
>>
>> On 26 October 2015 at 12:52, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>>> On 26/10/15 11:28, Bernd Schmidt wrote:
>>>> On 10/26/2015 12:12 PM, Bernd Schmidt wrote:
>>>>>
>>>>> But isn't that balanced by the fact that it doesn't seem to take into
>>>>> account the gain of removing the inc_insn either? So I think this can't
>>>>> be right.
>>>>
>>>> Argh, misread the code. The patch is OK with the change I suggested.
>>>>
>>> Thanks!
>>> Here's what I committed with r229344.
>>>
>> Since this commit, I've noticed:
>>
>> FAIL: gcc.target/arm/lp1243022.c scan-rtl-dump subreg2 "REG_INC"
>> with --target arm-none-eabi --with-thumb --with-cpu=cortex-a9
>
> I think this is a testcase issue, I'll look into updating it separately.
>
>> as well as ICEs:
>>    gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2  (internal compiler error)
>>    gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2 -flto
>> -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
>>
>> with --target arm-none-linux-gnueabihf --with-thumb
>> --with-cpu=cortex-a15 --with-fpu=neon-vfpv4
>> and --target arm-none-linux-gnueabihf --with-thumb
>> --with-cpu=cortex-a57 --with-fpu=crypto-neon-fp-armv8
>>
>> See for a more synthetic view:
>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/229344/report-build-info.html
>>
>> Can you have a look?
>
> The ICE backtrace is:
> 0x7a4494 change_address_1
>         $SRC/gcc/emit-rtl.c:2132
> 0x7a7453 adjust_address_1(rtx_def*, machine_mode, long, int, int, int, long)
>         $SRC/gcc/emit-rtl.c:2270
> 0xab792d gen_lowpart_general(machine_mode, rtx_def*)
>         $SRC/gcc/rtlhooks.c:90
> 0xfd721d gen_split_47(rtx_insn*, rtx_def**)
>         $SRC/gcc/config/arm/arm.md:4336
> 0x12473f2 split_11
>         $SRC/gcc/config/arm/arm.md:4331
> 0x12473f2 split_insns(rtx_def*, rtx_insn*)
>         $SRC/gcc/config/arm/sync.md:361
> 0x7af30e try_split(rtx_def*, rtx_insn*, int)
>         $SRC/gcc/emit-rtl.c:3664
> 0xa53dc5 split_insn
>         $SRC/gcc/recog.c:2874
> 0xa5ccf5 split_all_insns()
>         $SRC/gcc/recog.c:2964
> 0xa5cde3 rest_of_handle_split_after_reload
>         $SRC/gcc/recog.c:3900
> 0xa5cde3 execute
>         $SRC/gcc/recog.c:3929
>
>
> Looks like a latent issue exposed by my patch.
> Could you please file a BZ entry if get the chance?
>

I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68149 for this one.

Kyrill

> Thanks,
> Kyrill
>
>
>> Thanks,
>>
>> Christophe.
>>
>>> Kyrill
>>>
>>> 2015-10-26  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>
>>>      * auto-inc-dec.c (insert_move_insn_before): Delete.
>>>      (attempt_change): Remember to cost the simple move in the
>>>      FORM_PRE_ADD and FORM_POST_ADD cases.
>>>
>>>> Bernd
>>>>
>
diff mbox

Patch

commit cc7c4748eac1f9d59ceb5393132c098aba99765d
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Fri Oct 16 13:46:51 2015 +0100

    [auto-inc-dec.c] Account for cost of move operation in FORM_PRE_ADD and FORM_POST_ADD cases

diff --git a/gcc/auto-inc-dec.c b/gcc/auto-inc-dec.c
index e003b13..9f7c8e0 100644
--- a/gcc/auto-inc-dec.c
+++ b/gcc/auto-inc-dec.c
@@ -439,24 +439,6 @@  move_dead_notes (rtx_insn *to_insn, rtx_insn *from_insn, rtx pattern)
     }
 }
 
-
-/* Create a mov insn DEST_REG <- SRC_REG and insert it before
-   NEXT_INSN.  */
-
-static rtx_insn *
-insert_move_insn_before (rtx_insn *next_insn, rtx dest_reg, rtx src_reg)
-{
-  rtx_insn *insns;
-
-  start_sequence ();
-  emit_move_insn (dest_reg, src_reg);
-  insns = get_insns ();
-  end_sequence ();
-  emit_insn_before (insns, next_insn);
-  return insns;
-}
-
-
 /* Change mem_insn.mem_loc so that uses NEW_ADDR which has an
    increment of INC_REG.  To have reached this point, the change is a
    legitimate one from a dataflow point of view.  The only questions
@@ -490,8 +472,21 @@  attempt_change (rtx new_addr, rtx inc_reg)
 
   old_cost = (set_src_cost (mem, mode, speed)
 	      + set_rtx_cost (PATTERN (inc_insn.insn), speed));
+
   new_cost = set_src_cost (mem_tmp, mode, speed);
 
+  /* In the FORM_PRE_ADD and FORM_POST_ADD cases we emit an extra move
+     whose cost we should account for.  */
+  if (inc_insn.form == FORM_PRE_ADD
+      || inc_insn.form == FORM_POST_ADD)
+    {
+      start_sequence ();
+      emit_move_insn (inc_insn.reg_res, inc_insn.reg0);
+      mov_insn = get_insns ();
+      end_sequence ();
+      new_cost += seq_cost (mov_insn, speed);
+    }
+
   /* The first item of business is to see if this is profitable.  */
   if (old_cost < new_cost)
     {
@@ -522,8 +517,8 @@  attempt_change (rtx new_addr, rtx inc_reg)
       /* Replace the addition with a move.  Do it at the location of
 	 the addition since the operand of the addition may change
 	 before the memory reference.  */
-      mov_insn = insert_move_insn_before (inc_insn.insn,
-					  inc_insn.reg_res, inc_insn.reg0);
+      gcc_assert (mov_insn);
+      emit_insn_before (mov_insn, inc_insn.insn);
       move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0);
 
       regno = REGNO (inc_insn.reg_res);
@@ -548,8 +543,8 @@  attempt_change (rtx new_addr, rtx inc_reg)
       break;
 
     case FORM_POST_ADD:
-      mov_insn = insert_move_insn_before (mem_insn.insn,
-					  inc_insn.reg_res, inc_insn.reg0);
+      gcc_assert (mov_insn);
+      emit_insn_before (mov_insn, mem_insn.insn);
       move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0);
 
       /* Do not move anything to the mov insn because the instruction