diff mbox

LRA: Fix caller-save store/restore instruction for large mode

Message ID CA+yXCZCO=aabNSt-gJuj==bH7i9dNPr6-T+oz4jkvwdJAdfpfA@mail.gmail.com
State New
Headers show

Commit Message

Kito Cheng Jan. 5, 2015, 7:44 a.m. UTC
Hi Vladimir:
  This patch has a discusses with you in May 2014, this patch is about
the caller-save register store and restore instruction generation, the
current LRA implementation will miss caller-save store/restore
instruction if need one more instruction.

You said you will investigate for this on IRC, so I don't send the
patch last year, however ARM guys seem got this problem too, so I
think it's time to send this patch :)

ChangeLog

2015-01-05  Kito Cheng  <kito@0xlab.org>

        * lra-constraints.c (split_reg): Fix caller-save store/restore
instruction generation.

Comments

Bin.Cheng Jan. 5, 2015, 8:13 a.m. UTC | #1
On Mon, Jan 5, 2015 at 3:44 PM, Kito Cheng <kito.cheng@gmail.com> wrote:
> Hi Vladimir:
>   This patch has a discusses with you in May 2014, this patch is about
> the caller-save register store and restore instruction generation, the
> current LRA implementation will miss caller-save store/restore
> instruction if need one more instruction.
>
> You said you will investigate for this on IRC, so I don't send the
> patch last year, however ARM guys seem got this problem too, so I
> think it's time to send this patch :)
>
> ChangeLog
>
> 2015-01-05  Kito Cheng  <kito@0xlab.org>
>
>         * lra-constraints.c (split_reg): Fix caller-save store/restore
> instruction generation.
Hi,
Thanks for saving my work, I was going to send this exact patch.
Note that we have PR64348 tracking the issue now.

Thanks,
bin
Jeff Law Jan. 5, 2015, 5:31 p.m. UTC | #2
On 01/05/15 00:44, Kito Cheng wrote:
> Hi Vladimir:
>    This patch has a discusses with you in May 2014, this patch is about
> the caller-save register store and restore instruction generation, the
> current LRA implementation will miss caller-save store/restore
> instruction if need one more instruction.
>
> You said you will investigate for this on IRC, so I don't send the
> patch last year, however ARM guys seem got this problem too, so I
> think it's time to send this patch :)
>
> ChangeLog
>
> 2015-01-05  Kito Cheng  <kito@0xlab.org>
>
>          * lra-constraints.c (split_reg): Fix caller-save store/restore
> instruction generation.
Please reference PR64348 in the ChangeLog entry.

Please include a testcase if there isn't one in the regression suite 
already.

Please indicate what platform this patch was bootstrapped and regression 
tested on.

The dumping code immediately after the assert you removed has code like 
this in both cases:



          fprintf (lra_dump_file,
                    "    Rejecting split %d->%d "
                    "resulting in > 2 %s restore insns:\n",
                    original_regno, REGNO (new_reg), call_save_p ? 
"call" : "");

Testing call_save_p here won't make any sense after your patch.

I'll let Vlad chime in on the correctness of allowing multi register 
saves/restores in this code.

Jeff
Vladimir Makarov Jan. 5, 2015, 11:36 p.m. UTC | #3
On 2015-01-05 12:31 PM, Jeff Law wrote:
> On 01/05/15 00:44, Kito Cheng wrote:
>> Hi Vladimir:
>>    This patch has a discusses with you in May 2014, this patch is about
>> the caller-save register store and restore instruction generation, the
>> current LRA implementation will miss caller-save store/restore
>> instruction if need one more instruction.
>>
>> You said you will investigate for this on IRC, so I don't send the
>> patch last year, however ARM guys seem got this problem too, so I
>> think it's time to send this patch :)
>>
>> ChangeLog
>>
>> 2015-01-05  Kito Cheng  <kito@0xlab.org>
>>
>>          * lra-constraints.c (split_reg): Fix caller-save store/restore
>> instruction generation.
> Please reference PR64348 in the ChangeLog entry.
>
> Please include a testcase if there isn't one in the regression suite 
> already.
>
> Please indicate what platform this patch was bootstrapped and 
> regression tested on.
>
> The dumping code immediately after the assert you removed has code 
> like this in both cases:
>
>
>
>          fprintf (lra_dump_file,
>                    "    Rejecting split %d->%d "
>                    "resulting in > 2 %s restore insns:\n",
>                    original_regno, REGNO (new_reg), call_save_p ? 
> "call" : "");
>
> Testing call_save_p here won't make any sense after your patch.
>
> I'll let Vlad chime in on the correctness of allowing multi register 
> saves/restores in this code.
>
The solution itself is ok.  Prohibiting generation of more one insn was 
intended for inheritance only as inheritance transformation can be 
undone when the inheritance pseudo does not get a hard register. Undoing 
multi-register splitting is difficult and also such splitting is 
doubtedly profitable.

Splitting for save/restore is never undone.  So it is ok for this case 
to generate multi-register saves/restores.

Kito, Jeff wrote reasonable changes for the patch.  Please, do them and 
you can commit the patch.

Thanks.
Bin.Cheng Jan. 6, 2015, 1:40 a.m. UTC | #4
On Tue, Jan 6, 2015 at 7:36 AM, Vladimir Makarov <vmakarov@redhat.com> wrote:
>
> On 2015-01-05 12:31 PM, Jeff Law wrote:
>>
>> On 01/05/15 00:44, Kito Cheng wrote:
>>>
>>> Hi Vladimir:
>>>    This patch has a discusses with you in May 2014, this patch is about
>>> the caller-save register store and restore instruction generation, the
>>> current LRA implementation will miss caller-save store/restore
>>> instruction if need one more instruction.
>>>
>>> You said you will investigate for this on IRC, so I don't send the
>>> patch last year, however ARM guys seem got this problem too, so I
>>> think it's time to send this patch :)
>>>
>>> ChangeLog
>>>
>>> 2015-01-05  Kito Cheng  <kito@0xlab.org>
>>>
>>>          * lra-constraints.c (split_reg): Fix caller-save store/restore
>>> instruction generation.
>>
>> Please reference PR64348 in the ChangeLog entry.
>>
>> Please include a testcase if there isn't one in the regression suite
>> already.
>>
>> Please indicate what platform this patch was bootstrapped and regression
>> tested on.
>>
>> The dumping code immediately after the assert you removed has code like
>> this in both cases:
>>
>>
>>
>>          fprintf (lra_dump_file,
>>                    "    Rejecting split %d->%d "
>>                    "resulting in > 2 %s restore insns:\n",
>>                    original_regno, REGNO (new_reg), call_save_p ? "call" :
>> "");
>>
>> Testing call_save_p here won't make any sense after your patch.
>>
>> I'll let Vlad chime in on the correctness of allowing multi register
>> saves/restores in this code.
>>
> The solution itself is ok.  Prohibiting generation of more one insn was
> intended for inheritance only as inheritance transformation can be undone
> when the inheritance pseudo does not get a hard register. Undoing
> multi-register splitting is difficult and also such splitting is doubtedly
> profitable.
>
> Splitting for save/restore is never undone.  So it is ok for this case to
> generate multi-register saves/restores.
>
> Kito, Jeff wrote reasonable changes for the patch.  Please, do them and you
> can commit the patch.

Hi Vlad,
As for this specific case in PR64348, dump IR crossing function call
is as below:

  430: [sfp:SI-0x30]=r989:TI#0
  432: [r1706:SI+0x4]=r989:TI#4
  434: [r1706:SI+0x8]=r989:TI#8
  436: [r1706:SI+0xc]=r989:TI#12
  441: r0:DI=call [`__aeabi_idivmod'] argc:0
      REG_UNUSED r0:SI
      REG_CALL_DECL `__aeabi_idivmod'
      REG_EH_REGION 0xffffffff80000000
  437: r1007:SI=sign_extend(r989:TI#0)
      REG_DEAD r989:TI

Save/restore are introduced because of use of r989:#0 in insn 437.
Register r989 is TImode register and assigned to r2/r3/r4/r5.  I can
think about two possible improvements to LRA splitter.  1) According
to ARM EABI, only r2/r3 need to be saved/restored; 2) In this case, we
only need to save/restore r989:TI#0, which is r2.  In fact, it has
already been saved to stack in insn 430, what we need is to restore
r989:TI#0 after function call.

What do you think about these?

Thanks,
bin

>
> Thanks.
>
Vladimir Makarov Jan. 7, 2015, 4:34 a.m. UTC | #5
On 2015-01-05 8:40 PM, Bin.Cheng wrote:
> On Tue, Jan 6, 2015 at 7:36 AM, Vladimir Makarov <vmakarov@redhat.com> wrote:
>> On 2015-01-05 12:31 PM, Jeff Law wrote:
>>> On 01/05/15 00:44, Kito Cheng wrote:
>>>> Hi Vladimir:
>>>>     This patch has a discusses with you in May 2014, this patch is about
>>>> the caller-save register store and restore instruction generation, the
>>>> current LRA implementation will miss caller-save store/restore
>>>> instruction if need one more instruction.
>>>>
>>>> You said you will investigate for this on IRC, so I don't send the
>>>> patch last year, however ARM guys seem got this problem too, so I
>>>> think it's time to send this patch :)
>>>>
>>>> ChangeLog
>>>>
>>>> 2015-01-05  Kito Cheng  <kito@0xlab.org>
>>>>
>>>>           * lra-constraints.c (split_reg): Fix caller-save store/restore
>>>> instruction generation.
>>> Please reference PR64348 in the ChangeLog entry.
>>>
>>> Please include a testcase if there isn't one in the regression suite
>>> already.
>>>
>>> Please indicate what platform this patch was bootstrapped and regression
>>> tested on.
>>>
>>> The dumping code immediately after the assert you removed has code like
>>> this in both cases:
>>>
>>>
>>>
>>>           fprintf (lra_dump_file,
>>>                     "    Rejecting split %d->%d "
>>>                     "resulting in > 2 %s restore insns:\n",
>>>                     original_regno, REGNO (new_reg), call_save_p ? "call" :
>>> "");
>>>
>>> Testing call_save_p here won't make any sense after your patch.
>>>
>>> I'll let Vlad chime in on the correctness of allowing multi register
>>> saves/restores in this code.
>>>
>> The solution itself is ok.  Prohibiting generation of more one insn was
>> intended for inheritance only as inheritance transformation can be undone
>> when the inheritance pseudo does not get a hard register. Undoing
>> multi-register splitting is difficult and also such splitting is doubtedly
>> profitable.
>>
>> Splitting for save/restore is never undone.  So it is ok for this case to
>> generate multi-register saves/restores.
>>
>> Kito, Jeff wrote reasonable changes for the patch.  Please, do them and you
>> can commit the patch.
> Hi Vlad,
> As for this specific case in PR64348, dump IR crossing function call
> is as below:
>
>    430: [sfp:SI-0x30]=r989:TI#0
>    432: [r1706:SI+0x4]=r989:TI#4
>    434: [r1706:SI+0x8]=r989:TI#8
>    436: [r1706:SI+0xc]=r989:TI#12
>    441: r0:DI=call [`__aeabi_idivmod'] argc:0
>        REG_UNUSED r0:SI
>        REG_CALL_DECL `__aeabi_idivmod'
>        REG_EH_REGION 0xffffffff80000000
>    437: r1007:SI=sign_extend(r989:TI#0)
>        REG_DEAD r989:TI
>
> Save/restore are introduced because of use of r989:#0 in insn 437.
> Register r989 is TImode register and assigned to r2/r3/r4/r5.  I can
> think about two possible improvements to LRA splitter.  1) According
> to ARM EABI, only r2/r3 need to be saved/restored; 2) In this case, we
> only need to save/restore r989:TI#0, which is r2.  In fact, it has
> already been saved to stack in insn 430, what we need is to restore
> r989:TI#0 after function call.
>
> What do you think about these?
>
>
Thanks for the interesting case.

It would be nice to implement this but it is hard.   For saving only r2 
we need live range analysis on sub-register level which is complicated 
(even IRA has such code only at most for 2-registers pseudos).  For 
reusing memory from insn 430 is complicated too if it is not equiv memory.

May be adding hard registers explicitly to this code could help 
(save/restore splitting is done only on the 1st LRA sub-pass; after that 
pseudos can get a call-clobbered hard register only if it does not 
intersect calls; if pseudo r989 is spilled on later LRA sub-passes we 
could remove code involving explicit subregisters). I'll think about this.
diff mbox

Patch

From 434dc294cf3f2267eed89d27b34c53c079e2b25a Mon Sep 17 00:00:00 2001
From: Kito Cheng <kito@andestech.com>
Date: Thu, 29 May 2014 23:53:23 +0800
Subject: [PATCH] Fix caller-save store/restore instruction for large mode in
 lra

 - The original code assume store/restore will always have only
   one insn, it's will fail if you split in move pattern!!!
---
 gcc/lra-constraints.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 382281c..4dcf553 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -4918,9 +4918,8 @@  split_reg (bool before_p, int original_regno, rtx_insn *insn,
       reg_renumber[REGNO (new_reg)] = hard_regno;
     }
   save = emit_spill_move (true, new_reg, original_reg);
-  if (NEXT_INSN (save) != NULL_RTX)
+  if (NEXT_INSN (save) != NULL_RTX && !call_save_p)
     {
-      lra_assert (! call_save_p);
       if (lra_dump_file != NULL)
 	{
 	  fprintf
@@ -4934,9 +4933,8 @@  split_reg (bool before_p, int original_regno, rtx_insn *insn,
       return false;
     }
   restore = emit_spill_move (false, new_reg, original_reg);
-  if (NEXT_INSN (restore) != NULL_RTX)
+  if (NEXT_INSN (restore) != NULL_RTX && !call_save_p)
     {
-      lra_assert (! call_save_p);
       if (lra_dump_file != NULL)
 	{
 	  fprintf (lra_dump_file,
-- 
1.7.6