Message ID | CA+yXCZCO=aabNSt-gJuj==bH7i9dNPr6-T+oz4jkvwdJAdfpfA@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Mon, Jan 5, 2015 at 3:44 PM, Kito Cheng <kito.cheng@gmail.com> wrote: > Hi Vladimir: > This patch has a discusses with you in May 2014, this patch is about > the caller-save register store and restore instruction generation, the > current LRA implementation will miss caller-save store/restore > instruction if need one more instruction. > > You said you will investigate for this on IRC, so I don't send the > patch last year, however ARM guys seem got this problem too, so I > think it's time to send this patch :) > > ChangeLog > > 2015-01-05 Kito Cheng <kito@0xlab.org> > > * lra-constraints.c (split_reg): Fix caller-save store/restore > instruction generation. Hi, Thanks for saving my work, I was going to send this exact patch. Note that we have PR64348 tracking the issue now. Thanks, bin
On 01/05/15 00:44, Kito Cheng wrote: > Hi Vladimir: > This patch has a discusses with you in May 2014, this patch is about > the caller-save register store and restore instruction generation, the > current LRA implementation will miss caller-save store/restore > instruction if need one more instruction. > > You said you will investigate for this on IRC, so I don't send the > patch last year, however ARM guys seem got this problem too, so I > think it's time to send this patch :) > > ChangeLog > > 2015-01-05 Kito Cheng <kito@0xlab.org> > > * lra-constraints.c (split_reg): Fix caller-save store/restore > instruction generation. Please reference PR64348 in the ChangeLog entry. Please include a testcase if there isn't one in the regression suite already. Please indicate what platform this patch was bootstrapped and regression tested on. The dumping code immediately after the assert you removed has code like this in both cases: fprintf (lra_dump_file, " Rejecting split %d->%d " "resulting in > 2 %s restore insns:\n", original_regno, REGNO (new_reg), call_save_p ? "call" : ""); Testing call_save_p here won't make any sense after your patch. I'll let Vlad chime in on the correctness of allowing multi register saves/restores in this code. Jeff
On 2015-01-05 12:31 PM, Jeff Law wrote: > On 01/05/15 00:44, Kito Cheng wrote: >> Hi Vladimir: >> This patch has a discusses with you in May 2014, this patch is about >> the caller-save register store and restore instruction generation, the >> current LRA implementation will miss caller-save store/restore >> instruction if need one more instruction. >> >> You said you will investigate for this on IRC, so I don't send the >> patch last year, however ARM guys seem got this problem too, so I >> think it's time to send this patch :) >> >> ChangeLog >> >> 2015-01-05 Kito Cheng <kito@0xlab.org> >> >> * lra-constraints.c (split_reg): Fix caller-save store/restore >> instruction generation. > Please reference PR64348 in the ChangeLog entry. > > Please include a testcase if there isn't one in the regression suite > already. > > Please indicate what platform this patch was bootstrapped and > regression tested on. > > The dumping code immediately after the assert you removed has code > like this in both cases: > > > > fprintf (lra_dump_file, > " Rejecting split %d->%d " > "resulting in > 2 %s restore insns:\n", > original_regno, REGNO (new_reg), call_save_p ? > "call" : ""); > > Testing call_save_p here won't make any sense after your patch. > > I'll let Vlad chime in on the correctness of allowing multi register > saves/restores in this code. > The solution itself is ok. Prohibiting generation of more one insn was intended for inheritance only as inheritance transformation can be undone when the inheritance pseudo does not get a hard register. Undoing multi-register splitting is difficult and also such splitting is doubtedly profitable. Splitting for save/restore is never undone. So it is ok for this case to generate multi-register saves/restores. Kito, Jeff wrote reasonable changes for the patch. Please, do them and you can commit the patch. Thanks.
On Tue, Jan 6, 2015 at 7:36 AM, Vladimir Makarov <vmakarov@redhat.com> wrote: > > On 2015-01-05 12:31 PM, Jeff Law wrote: >> >> On 01/05/15 00:44, Kito Cheng wrote: >>> >>> Hi Vladimir: >>> This patch has a discusses with you in May 2014, this patch is about >>> the caller-save register store and restore instruction generation, the >>> current LRA implementation will miss caller-save store/restore >>> instruction if need one more instruction. >>> >>> You said you will investigate for this on IRC, so I don't send the >>> patch last year, however ARM guys seem got this problem too, so I >>> think it's time to send this patch :) >>> >>> ChangeLog >>> >>> 2015-01-05 Kito Cheng <kito@0xlab.org> >>> >>> * lra-constraints.c (split_reg): Fix caller-save store/restore >>> instruction generation. >> >> Please reference PR64348 in the ChangeLog entry. >> >> Please include a testcase if there isn't one in the regression suite >> already. >> >> Please indicate what platform this patch was bootstrapped and regression >> tested on. >> >> The dumping code immediately after the assert you removed has code like >> this in both cases: >> >> >> >> fprintf (lra_dump_file, >> " Rejecting split %d->%d " >> "resulting in > 2 %s restore insns:\n", >> original_regno, REGNO (new_reg), call_save_p ? "call" : >> ""); >> >> Testing call_save_p here won't make any sense after your patch. >> >> I'll let Vlad chime in on the correctness of allowing multi register >> saves/restores in this code. >> > The solution itself is ok. Prohibiting generation of more one insn was > intended for inheritance only as inheritance transformation can be undone > when the inheritance pseudo does not get a hard register. Undoing > multi-register splitting is difficult and also such splitting is doubtedly > profitable. > > Splitting for save/restore is never undone. So it is ok for this case to > generate multi-register saves/restores. > > Kito, Jeff wrote reasonable changes for the patch. Please, do them and you > can commit the patch. Hi Vlad, As for this specific case in PR64348, dump IR crossing function call is as below: 430: [sfp:SI-0x30]=r989:TI#0 432: [r1706:SI+0x4]=r989:TI#4 434: [r1706:SI+0x8]=r989:TI#8 436: [r1706:SI+0xc]=r989:TI#12 441: r0:DI=call [`__aeabi_idivmod'] argc:0 REG_UNUSED r0:SI REG_CALL_DECL `__aeabi_idivmod' REG_EH_REGION 0xffffffff80000000 437: r1007:SI=sign_extend(r989:TI#0) REG_DEAD r989:TI Save/restore are introduced because of use of r989:#0 in insn 437. Register r989 is TImode register and assigned to r2/r3/r4/r5. I can think about two possible improvements to LRA splitter. 1) According to ARM EABI, only r2/r3 need to be saved/restored; 2) In this case, we only need to save/restore r989:TI#0, which is r2. In fact, it has already been saved to stack in insn 430, what we need is to restore r989:TI#0 after function call. What do you think about these? Thanks, bin > > Thanks. >
On 2015-01-05 8:40 PM, Bin.Cheng wrote: > On Tue, Jan 6, 2015 at 7:36 AM, Vladimir Makarov <vmakarov@redhat.com> wrote: >> On 2015-01-05 12:31 PM, Jeff Law wrote: >>> On 01/05/15 00:44, Kito Cheng wrote: >>>> Hi Vladimir: >>>> This patch has a discusses with you in May 2014, this patch is about >>>> the caller-save register store and restore instruction generation, the >>>> current LRA implementation will miss caller-save store/restore >>>> instruction if need one more instruction. >>>> >>>> You said you will investigate for this on IRC, so I don't send the >>>> patch last year, however ARM guys seem got this problem too, so I >>>> think it's time to send this patch :) >>>> >>>> ChangeLog >>>> >>>> 2015-01-05 Kito Cheng <kito@0xlab.org> >>>> >>>> * lra-constraints.c (split_reg): Fix caller-save store/restore >>>> instruction generation. >>> Please reference PR64348 in the ChangeLog entry. >>> >>> Please include a testcase if there isn't one in the regression suite >>> already. >>> >>> Please indicate what platform this patch was bootstrapped and regression >>> tested on. >>> >>> The dumping code immediately after the assert you removed has code like >>> this in both cases: >>> >>> >>> >>> fprintf (lra_dump_file, >>> " Rejecting split %d->%d " >>> "resulting in > 2 %s restore insns:\n", >>> original_regno, REGNO (new_reg), call_save_p ? "call" : >>> ""); >>> >>> Testing call_save_p here won't make any sense after your patch. >>> >>> I'll let Vlad chime in on the correctness of allowing multi register >>> saves/restores in this code. >>> >> The solution itself is ok. Prohibiting generation of more one insn was >> intended for inheritance only as inheritance transformation can be undone >> when the inheritance pseudo does not get a hard register. Undoing >> multi-register splitting is difficult and also such splitting is doubtedly >> profitable. >> >> Splitting for save/restore is never undone. So it is ok for this case to >> generate multi-register saves/restores. >> >> Kito, Jeff wrote reasonable changes for the patch. Please, do them and you >> can commit the patch. > Hi Vlad, > As for this specific case in PR64348, dump IR crossing function call > is as below: > > 430: [sfp:SI-0x30]=r989:TI#0 > 432: [r1706:SI+0x4]=r989:TI#4 > 434: [r1706:SI+0x8]=r989:TI#8 > 436: [r1706:SI+0xc]=r989:TI#12 > 441: r0:DI=call [`__aeabi_idivmod'] argc:0 > REG_UNUSED r0:SI > REG_CALL_DECL `__aeabi_idivmod' > REG_EH_REGION 0xffffffff80000000 > 437: r1007:SI=sign_extend(r989:TI#0) > REG_DEAD r989:TI > > Save/restore are introduced because of use of r989:#0 in insn 437. > Register r989 is TImode register and assigned to r2/r3/r4/r5. I can > think about two possible improvements to LRA splitter. 1) According > to ARM EABI, only r2/r3 need to be saved/restored; 2) In this case, we > only need to save/restore r989:TI#0, which is r2. In fact, it has > already been saved to stack in insn 430, what we need is to restore > r989:TI#0 after function call. > > What do you think about these? > > Thanks for the interesting case. It would be nice to implement this but it is hard. For saving only r2 we need live range analysis on sub-register level which is complicated (even IRA has such code only at most for 2-registers pseudos). For reusing memory from insn 430 is complicated too if it is not equiv memory. May be adding hard registers explicitly to this code could help (save/restore splitting is done only on the 1st LRA sub-pass; after that pseudos can get a call-clobbered hard register only if it does not intersect calls; if pseudo r989 is spilled on later LRA sub-passes we could remove code involving explicit subregisters). I'll think about this.
From 434dc294cf3f2267eed89d27b34c53c079e2b25a Mon Sep 17 00:00:00 2001 From: Kito Cheng <kito@andestech.com> Date: Thu, 29 May 2014 23:53:23 +0800 Subject: [PATCH] Fix caller-save store/restore instruction for large mode in lra - The original code assume store/restore will always have only one insn, it's will fail if you split in move pattern!!! --- gcc/lra-constraints.c | 6 ++---- 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c index 382281c..4dcf553 100644 --- a/gcc/lra-constraints.c +++ b/gcc/lra-constraints.c @@ -4918,9 +4918,8 @@ split_reg (bool before_p, int original_regno, rtx_insn *insn, reg_renumber[REGNO (new_reg)] = hard_regno; } save = emit_spill_move (true, new_reg, original_reg); - if (NEXT_INSN (save) != NULL_RTX) + if (NEXT_INSN (save) != NULL_RTX && !call_save_p) { - lra_assert (! call_save_p); if (lra_dump_file != NULL) { fprintf @@ -4934,9 +4933,8 @@ split_reg (bool before_p, int original_regno, rtx_insn *insn, return false; } restore = emit_spill_move (false, new_reg, original_reg); - if (NEXT_INSN (restore) != NULL_RTX) + if (NEXT_INSN (restore) != NULL_RTX && !call_save_p) { - lra_assert (! call_save_p); if (lra_dump_file != NULL) { fprintf (lra_dump_file, -- 1.7.6