Message ID | 4D010797.60004@codesourcery.com |
---|---|
State | New |
Headers | show |
On 12/09/2010 05:45 PM, Andrew Stubbs wrote: > The first problem is that CSE cannot determine that the result is > constant because the auto-variable is implicitly initialized. I have > solved this by moving up the init-regs pass to before cse2. It might be > better to move it before cse1, but that's a bigger change, so I wasn't > sure? I think that would be fine independent of this change. > The second problem is that the pattern match for ZERO_EXTRACT requires > that the operand is an immediate constant, which is never the case on > ARM (and presumably is only the case with a limited range of inputs even > on other targets). I have added code to detect known-constant input > registers. Since I can choose, I prefer Chung-Lin's patch. :) Paolo
On 09/12/10 17:47, Paolo Bonzini wrote: > On 12/09/2010 05:45 PM, Andrew Stubbs wrote: >> The first problem is that CSE cannot determine that the result is >> constant because the auto-variable is implicitly initialized. I have >> solved this by moving up the init-regs pass to before cse2. It might be >> better to move it before cse1, but that's a bigger change, so I wasn't >> sure? > > I think that would be fine independent of this change. > >> The second problem is that the pattern match for ZERO_EXTRACT requires >> that the operand is an immediate constant, which is never the case on >> ARM (and presumably is only the case with a limited range of inputs even >> on other targets). I have added code to detect known-constant input >> registers. > > Since I can choose, I prefer Chung-Lin's patch. :) So, FAOD, you think the init-regs pass should be moved, but the ZERO_EXTRACT code should be dropped in favour of the other patch? Should the init-regs be moved before cse1 or cse2? I'm happy to resubmit such a patch, but would it be applicable now, or would I have to wait for stage one? Andrew
On 12/10/2010 10:46 AM, Andrew Stubbs wrote: > > Should the init-regs be moved before cse1 or cse2? CSE1. We definitely need to catch every opportunity to get rid of the initializations, so moving the pass earlier is nice. > I'm happy to resubmit such a patch, but would it be applicable now, or > would I have to wait for stage one? Try running CSiBE or comparing a set of .i files. If you see any case in which the change stands out as particularly beneficial, it may even be fine now. Otherwise, including if there are size regressions, it should wait. Paolo
On Fri, Dec 10, 2010 at 10:56 AM, Paolo Bonzini <bonzini@gnu.org> wrote: > On 12/10/2010 10:46 AM, Andrew Stubbs wrote: >> >> Should the init-regs be moved before cse1 or cse2? > > CSE1. > > We definitely need to catch every opportunity to get rid of the > initializations, so moving the pass earlier is nice. > >> I'm happy to resubmit such a patch, but would it be applicable now, or >> would I have to wait for stage one? > > Try running CSiBE or comparing a set of .i files. If you see any case in > which the change stands out as particularly beneficial, it may even be fine > now. Otherwise, including if there are size regressions, it should wait. I think we don't want to re-order passes at this point in time. Richard. > Paolo >
On 12/09/10 09:45, Andrew Stubbs wrote: > The attached patch fixes a bug in which constant assignments to bit > fields are improperly optimized. I'm seeing this on ARM, but I imagine > it affects other targets similarly. > > The first problem is that CSE cannot determine that the result is > constant because the auto-variable is implicitly initialized. I have > solved this by moving up the init-regs pass to before cse2. It might > be better to move it before cse1, but that's a bigger change, so I > wasn't sure? If this change ends up being made, I would strongly recommend a comment about why the change was made be placed in passes.c. I would hazard a guess that the pass was placed late in the pipeline because there wasn't any value seen in running it prior to combine. The only thing I'd be concerned about would be introducing useless initializations if we move the pass earlier since fewer optimizers would have been run and thus more unexecutable paths are still in the CFG. It would be particularly interesting to run some benchmarks with only this change and see how codesize is affected. > > The second problem is that the pattern match for ZERO_EXTRACT requires > that the operand is an immediate constant, which is never the case on > ARM (and presumably is only the case with a limited range of inputs > even on other targets). I have added code to detect known-constant > input registers. This seems fairly reasonable. I believe the preferred method for queuing pending patches is to open a bug for failure to optimize, attach the patches & testcase to that bug. Then make the GCC 4.7 pending patches meta-bug depend on the failure to optimize bug. Jeff
On 10/12/10 16:00, Jeff Law wrote: > I believe the preferred method for queuing pending patches is to open > a bug for failure to optimize, attach the patches & testcase to that > bug. Then make the GCC 4.7 pending patches meta-bug depend on the > failure to optimize bug. Now filed here: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46888 Andrew
2010-12-09 Andrew Stubbs <ams@codesourcery.com> gcc/ * cse.c (cse_insn): Add support for ZERO_EXTRACT with a register source operand. * passes.c (init_optimization_passes): Move initialize_regs before cse2. --- src/gcc-mainline/gcc/cse.c | 24 ++++++++++++++++++++---- src/gcc-mainline/gcc/passes.c | 2 +- 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/src/gcc-mainline/gcc/cse.c b/src/gcc-mainline/gcc/cse.c index 3ab6b37..d283040 100644 --- a/src/gcc-mainline/gcc/cse.c +++ b/src/gcc-mainline/gcc/cse.c @@ -5036,7 +5036,6 @@ cse_insn (rtx insn) (set (zero_extract:M2 (reg:M N) (const_int C) (const_int D)) (reg:M2 O)). */ if (GET_CODE (SET_DEST (sets[i].rtl)) == ZERO_EXTRACT - && CONST_INT_P (trial) && CONST_INT_P (XEXP (SET_DEST (sets[i].rtl), 1)) && CONST_INT_P (XEXP (SET_DEST (sets[i].rtl), 2)) && REG_P (XEXP (SET_DEST (sets[i].rtl), 0)) @@ -5052,9 +5051,26 @@ cse_insn (rtx insn) unsigned int dest_hash = HASH (dest_reg, GET_MODE (dest_reg)); struct table_elt *dest_elt = lookup (dest_reg, dest_hash, GET_MODE (dest_reg)); - rtx dest_cst = NULL; + rtx dest_cst = NULL, src_cst = NULL; - if (dest_elt) + if (CONST_INT_P (trial)) + src_cst = trial; + else if (REG_P (trial)) + { + unsigned int src_hash = HASH (trial, GET_MODE (trial)); + struct table_elt *src_elt + = lookup (trial, src_hash, GET_MODE (trial)); + + if (src_elt) + for (p = src_elt->first_same_value; p; p = p->next_same_value) + if (p->is_const && CONST_INT_P (p->exp)) + { + src_cst = p->exp; + break; + } + } + + if (src_cst && dest_elt) for (p = dest_elt->first_same_value; p; p = p->next_same_value) if (p->is_const && CONST_INT_P (p->exp)) { @@ -5076,7 +5092,7 @@ cse_insn (rtx insn) else mask = ((HOST_WIDE_INT) 1 << INTVAL (width)) - 1; val &= ~(mask << shift); - val |= (INTVAL (trial) & mask) << shift; + val |= (INTVAL (src_cst) & mask) << shift; val = trunc_int_for_mode (val, GET_MODE (dest_reg)); validate_unshare_change (insn, &SET_DEST (sets[i].rtl), dest_reg, 1); diff --git a/src/gcc-mainline/gcc/passes.c b/src/gcc-mainline/gcc/passes.c index 4be61a9..c1c656d 100644 --- a/src/gcc-mainline/gcc/passes.c +++ b/src/gcc-mainline/gcc/passes.c @@ -990,11 +990,11 @@ init_optimization_passes (void) } NEXT_PASS (pass_web); NEXT_PASS (pass_rtl_cprop); + NEXT_PASS (pass_initialize_regs); NEXT_PASS (pass_cse2); NEXT_PASS (pass_rtl_dse1); NEXT_PASS (pass_rtl_fwprop_addr); NEXT_PASS (pass_inc_dec); - NEXT_PASS (pass_initialize_regs); NEXT_PASS (pass_ud_rtl_dce); NEXT_PASS (pass_combine); NEXT_PASS (pass_if_after_combine);