| Submitter | Uros Bizjak |
|---|---|
| Date | July 6, 2012, 10:55 a.m. |
| Message ID | <CAFULd4aurbAZkrfE5k+cmaxpycHFT54Aaxbj11iEifptpKFTEQ@mail.gmail.com> |
| Permalink | /patch/169411/ |
| State | New |
Comments
Uros Bizjak <ubizjak@gmail.com> writes:

> Hello!
>
> Sometimes, gcc generates:
>
> leaq (%rbx,%rax), %rax
>
> that is in fact equivalent (modulo flags reg clobber) to:
>
> addq %rbx, %rax
>
> Attached patch adds additional peephole2 patterns that convert LEA to
> ADD when second operand of PLUS RTX matches output operand.

Are you sure this is a win?

In the past on some CPUs this was done intentionally because the AGU
had more execution resources than the ALU.

-Andi
On Fri, Jul 6, 2012 at 5:24 PM, Andi Kleen <andi@firstfloor.org> wrote:

> Uros Bizjak <ubizjak@gmail.com> writes:
>> Sometimes, gcc generates:
>>
>> leaq (%rbx,%rax), %rax
>>
>> that is in fact equivalent (modulo flags reg clobber) to:
>>
>> addq %rbx, %rax
>>
>> Attached patch adds additional peephole2 patterns that convert LEA to
>> ADD when second operand of PLUS RTX matches output operand.
>
> Are you sure this is a win?
>
> In the past on some CPUs this was done intentionally because the AGU
> had more execution resources than the ALU.

This is just a case of commutative operand handling, following the existing approach. Please note that there is a separate pass that converts ADDs to LEAs when appropriate.

Uros.
Patch
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md (revision 189310)
+++ config/i386/i386.md (working copy)
@@ -17301,6 +17301,14 @@
                      (clobber (reg:CC FLAGS_REG))])])
 
 (define_peephole2
+  [(set (match_operand:SWI48 0 "register_operand")
+        (plus:SWI48 (match_operand:SWI48 1 "<nonmemory_operand>")
+                    (match_dup 0)))]
+  "peep2_regno_dead_p (0, FLAGS_REG)"
+  [(parallel [(set (match_dup 0) (plus:SWI48 (match_dup 0) (match_dup 1)))
+              (clobber (reg:CC FLAGS_REG))])])
+
+(define_peephole2
   [(set (match_operand:SI 0 "register_operand")
         (subreg:SI (plus:DI (match_operand:DI 1 "register_operand")
                             (match_operand:DI 2 "nonmemory_operand")) 0))]
@@ -17312,6 +17320,17 @@
   "operands[2] = gen_lowpart (SImode, operands[2]);")
 
 (define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+        (subreg:SI (plus:DI (match_operand:DI 1 "nonmemory_operand")
+                            (match_operand:DI 2 "register_operand")) 0))]
+  "TARGET_64BIT
+   && peep2_regno_dead_p (0, FLAGS_REG)
+   && REGNO (operands[0]) == REGNO (operands[2])"
+  [(parallel [(set (match_dup 0) (plus:SI (match_dup 0) (match_dup 1)))
+              (clobber (reg:CC FLAGS_REG))])]
+  "operands[1] = gen_lowpart (SImode, operands[1]);")
+
+(define_peephole2
   [(set (match_operand:SWI48 0 "register_operand")
         (mult:SWI48 (match_dup 0)
                     (match_operand:SWI48 1 "const_int_operand")))]
Hello!

Sometimes, gcc generates:

leaq (%rbx,%rax), %rax

that is in fact equivalent (modulo flags reg clobber) to:

addq %rbx, %rax

Attached patch adds additional peephole2 patterns that convert LEA to ADD when second operand of PLUS RTX matches output operand.

2012-07-06  Uros Bizjak  <ubizjak@gmail.com>

        * config/i386/i386.md (simple lea to add peephole): Also transform
        RTXes where second PLUS operand matches output.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.
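
For readers unfamiliar with peephole2, the rewrite rule described in this message can be modelled outside GCC as a small C function. This is only an illustrative sketch, not GCC code: `struct insn`, `enum opcode`, and `lea_to_add` are hypothetical names, registers are plain integers, and the `flags_dead` parameter stands in for the `peep2_regno_dead_p (0, FLAGS_REG)` condition. The existing peephole corresponds to the branch where the first source matches the destination; the case this patch is concerned with corresponds to the branch where the second source matches it.

```c
#include <stdbool.h>

/* Hypothetical toy instruction: dst = src1 + src2.
   OP_LEA leaves the flags alone, OP_ADD clobbers them. */
enum opcode { OP_LEA, OP_ADD };

struct insn {
    enum opcode op;
    int dst;   /* destination register number */
    int src1;  /* first source register number */
    int src2;  /* second source register number */
};

/* Rewrite "LEA dst = src1 + src2" as "ADD dst += other" when one source
   is the destination itself and the flags register is dead afterwards.
   Returns true if the instruction was rewritten. */
bool lea_to_add(struct insn *i, bool flags_dead)
{
    if (i->op != OP_LEA || !flags_dead)
        return false;

    if (i->src1 == i->dst) {
        /* First operand matches the output: already in ADD order. */
    } else if (i->src2 == i->dst) {
        /* Second operand matches the output: PLUS is commutative,
           so swap the sources to get the canonical ADD order. */
        int tmp = i->src1;
        i->src1 = i->src2;
        i->src2 = tmp;
    } else {
        /* Neither source is the destination: ADD cannot express this. */
        return false;
    }

    i->op = OP_ADD;
    return true;
}
```

With registers numbered so that 0 is %rax and 3 is %rbx (an arbitrary choice for this sketch), the instruction `{ OP_LEA, 0, 3, 0 }` models `leaq (%rbx,%rax), %rax`; calling `lea_to_add` on it with `flags_dead` set to true turns it into `{ OP_ADD, 0, 0, 3 }`, the model of `addq %rbx, %rax`. With `flags_dead` false the instruction is left untouched, since ADD would overwrite flags that are still needed.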