| Submitter | Richard Sandiford |
|---|---|
| Date | Aug. 8, 2010, 2:37 p.m. |
| Message ID | <871va9qifd.fsf@firetop.home> |
| Download | mbox | patch |
| Permalink | /patch/61213/ |
| State | New |
| Headers | show |
Comments
On 08/08/2010 10:37 AM, Richard Sandiford wrote: > Which brings me to a question. If it takes several consecutive > registers to store a particular mode, am I right in thinking that it is > the ALLOCNO_CONFLICT_HARD_REG_COSTS of the _first_ of those registers > (and only the first of those registers) that matters? For example. is > the cost of storing a quadword value in registers (4, 5, 6, 7) reflected > solely in ALLOCNO_CONFLICT_HARD_REG_COSTS (a)[4]? That looks to be how > the copy costs for hard registers works, and it seems to make conceptual > sense. Adding the cost for every register in the group would make an > overlapping register group like (5, 6, 7, 8) seem better than an > independent group like (8, 9, 10, 11), whereas I imagine the former > would often be worse (harder reloads). > > If the answer is "no, you're wrong", then the patch is wrong too. > But if the answer is "yes", then I'm puzzled why the code above is > looping at all. We've already established by this point (or at least, > are already confident enough to assume) that the operand needs a single > register. So why not just work out what that register is, and try to > calculate the corresponding register in the allocno's mode? > > You are completely right. > I realise this change probably conflicts with the ira improvements > branch, so I'm happy to sit on it until those changes go in. > But I think the same register-selection logic should apply regardless > of whether we're using cover classes. > > There will be only a tiny conflict on ira_get_register_move_cost part which I can easy resolve. > Either way, I'd like to apply the patch to 4.5 branch at some point, > since this is a regression from 4.4. > > > The patch is ok to commit to the trunk. Please, don't wait and apply the patch because I think ira-improv branch will be merged in 1-3 weeks. > gcc/ > PR rtl-optimization/43358 > * ira-lives.c (process_single_reg_class_operands): Adjust the costs > of a single hard register, using simplify_subreg_regno to decide > what that register should be. > Richard, thanks for working on the PR and providing the patch.
"Vladimir N. Makarov" <vmakarov@redhat.com> writes: > On 08/08/2010 10:37 AM, Richard Sandiford wrote: >> I realise this change probably conflicts with the ira improvements >> branch, so I'm happy to sit on it until those changes go in. >> But I think the same register-selection logic should apply regardless >> of whether we're using cover classes. >> > There will be only a tiny conflict on ira_get_register_move_cost part > which I can easy resolve. > >> Either way, I'd like to apply the patch to 4.5 branch at some point, >> since this is a regression from 4.4. >> > The patch is ok to commit to the trunk. Thanks Vlad, and sorry for the churn. Applied as 163249. I'll wait to see how things go before backporting to 4.5. I have a horrible feeling I'll be seeing a new PR number tomorrow morning... Richard
Patch
Index: gcc/ira-lives.c =================================================================== --- gcc/ira-lives.c 2010-08-08 10:35:06.000000000 +0100 +++ gcc/ira-lives.c 2010-08-08 10:35:08.000000000 +0100 @@ -896,7 +896,7 @@ ira_implicitly_set_insn_hard_regs (HARD_ static void process_single_reg_class_operands (bool in_p, int freq) { - int i, regno, cost; + int i, regno; unsigned int px; enum reg_class cl; rtx operand; @@ -923,32 +923,46 @@ process_single_reg_class_operands (bool if (REG_P (operand) && (regno = REGNO (operand)) >= FIRST_PSEUDO_REGISTER) { - enum machine_mode mode; enum reg_class cover_class; operand_a = ira_curr_regno_allocno_map[regno]; - mode = ALLOCNO_MODE (operand_a); cover_class = ALLOCNO_COVER_CLASS (operand_a); if (ira_class_subset_p[cl][cover_class] - && ira_class_hard_regs_num[cl] != 0 - && (ira_class_hard_reg_index[cover_class] - [ira_class_hard_regs[cl][0]]) >= 0 - && reg_class_size[cl] <= (unsigned) CLASS_MAX_NREGS (cl, mode)) + && ira_class_hard_regs_num[cl] != 0) { - int i, size; - cost - = (freq - * (in_p - ? ira_get_register_move_cost (mode, cover_class, cl) - : ira_get_register_move_cost (mode, cl, cover_class))); - ira_allocate_and_set_costs - (&ALLOCNO_CONFLICT_HARD_REG_COSTS (operand_a), cover_class, 0); - size = ira_reg_class_nregs[cover_class][mode]; - for (i = 0; i < size; i++) - ALLOCNO_CONFLICT_HARD_REG_COSTS (operand_a) - [ira_class_hard_reg_index - [cover_class][ira_class_hard_regs[cl][i]]] - -= cost; + /* View the desired allocation of OPERAND as: + + (REG:YMODE YREGNO), + + a simplification of: + + (subreg:YMODE (reg:XMODE XREGNO) OFFSET). */ + enum machine_mode ymode, xmode; + int xregno, yregno; + HOST_WIDE_INT offset; + + xmode = recog_data.operand_mode[i]; + xregno = ira_class_hard_regs[cl][0]; + ymode = ALLOCNO_MODE (operand_a); + offset = subreg_lowpart_offset (ymode, xmode); + yregno = simplify_subreg_regno (xregno, xmode, offset, ymode); + if (yregno >= 0 + && ira_class_hard_reg_index[cover_class][yregno] >= 0) + { + int cost; + + ira_allocate_and_set_costs + (&ALLOCNO_CONFLICT_HARD_REG_COSTS (operand_a), + cover_class, 0); + cost + = (freq + * (in_p + ? ira_get_register_move_cost (xmode, cover_class, cl) + : ira_get_register_move_cost (xmode, cl, + cover_class))); + ALLOCNO_CONFLICT_HARD_REG_COSTS (operand_a) + [ira_class_hard_reg_index[cover_class][yregno]] -= cost; + } } }
PR 43358 is about an ICE when freeing memory in IRA. The testcase in that PR was derived from linux, but the same thing can be seem with gcc.c-torture/execute/pr23135.c when compiled with -O or above on mips64-linux-gnu/-mabi=32. The immediate cause of the ICE is a write into index -1 of an ALLOCNO_CONFLICT_HARD_REG_COSTS array, which clobbers the pool id. But I think the problem is a little deeper than that. The code in question is part of the single-register-class handling: if (GET_CODE (operand) == SUBREG) operand = SUBREG_REG (operand); if (REG_P (operand) && (regno = REGNO (operand)) >= FIRST_PSEUDO_REGISTER) { enum machine_mode mode; enum reg_class cover_class; operand_a = ira_curr_regno_allocno_map[regno]; mode = ALLOCNO_MODE (operand_a); cover_class = ALLOCNO_COVER_CLASS (operand_a); if (ira_class_subset_p[cl][cover_class] && ira_class_hard_regs_num[cl] != 0 && (ira_class_hard_reg_index[cover_class] [ira_class_hard_regs[cl][0]]) >= 0 && reg_class_size[cl] <= (unsigned) CLASS_MAX_NREGS (cl, mode)) { int i, size; cost = (freq * (in_p ? ira_get_register_move_cost (mode, cover_class, cl) : ira_get_register_move_cost (mode, cl, cover_class))); ira_allocate_and_set_costs (&ALLOCNO_CONFLICT_HARD_REG_COSTS (operand_a), cover_class, 0); size = ira_reg_class_nregs[cover_class][mode]; for (i = 0; i < size; i++) ALLOCNO_CONFLICT_HARD_REG_COSTS (operand_a) [ira_class_hard_reg_index [cover_class][ira_class_hard_regs[cl][i]]] -= cost; } } What we have in the testcase is an SImode subreg of a DImode pseudo. The code has correctly realised that the pattern requires the SImode subreg to be in a single register class (MD1_REG), and is trying to reflect that restriction in the costs of the DImode allocno. Conceptually, this is exactly what we want to happen in the testcase. However, the loop is iterating over the size of the _allocno_, and expecting every register to belong to the single-register class (the ira_class_hard_regs[cl][i] expression). Because the allocno is wider than the operand, that assumption doesn't hold. The preferred register for the allocno (as opposed to the operand) is outside MD1_REG, but is still in the cover class. Which brings me to a question. If it takes several consecutive registers to store a particular mode, am I right in thinking that it is the ALLOCNO_CONFLICT_HARD_REG_COSTS of the _first_ of those registers (and only the first of those registers) that matters? For example. is the cost of storing a quadword value in registers (4, 5, 6, 7) reflected solely in ALLOCNO_CONFLICT_HARD_REG_COSTS (a)[4]? That looks to be how the copy costs for hard registers works, and it seems to make conceptual sense. Adding the cost for every register in the group would make an overlapping register group like (5, 6, 7, 8) seem better than an independent group like (8, 9, 10, 11), whereas I imagine the former would often be worse (harder reloads). If the answer is "no, you're wrong", then the patch is wrong too. But if the answer is "yes", then I'm puzzled why the code above is looping at all. We've already established by this point (or at least, are already confident enough to assume) that the operand needs a single register. So why not just work out what that register is, and try to calculate the corresponding register in the allocno's mode? That's what the patch below does. simplify_subreg_regno deals with (is supposed to deal with) cases where the subreg is invalid, such as if the register number is out of bounds, or if the mode change is rejected by the target. And we do of course still need to check that the result belongs to the allocno's class. I checked that the lower costs are still applied for this testcase and for others like it. I also compared the testsuite output for -O2 before and after the patch, both on mips64-linux-gnu and x86_64-linux-gnu. There were three changes in the mips64-linux-gnu output and no changes in the x86_64-linux-gnu output. Two of the three changes were significant improvements; they were cases where we used to copy a GPR into LO and back again for no reason. The other change was simply a renumbering of GPRs. I realise this change probably conflicts with the ira improvements branch, so I'm happy to sit on it until those changes go in. But I think the same register-selection logic should apply regardless of whether we're using cover classes. Either way, I'd like to apply the patch to 4.5 branch at some point, since this is a regression from 4.4. Richard gcc/ PR rtl-optimization/43358 * ira-lives.c (process_single_reg_class_operands): Adjust the costs of a single hard register, using simplify_subreg_regno to decide what that register should be.