Patchwork [i386] : Convert some more simple LEAs to ADD.

Submitter Uros Bizjak
Date July 6, 2012, 10:55 a.m.
Message ID <CAFULd4aurbAZkrfE5k+cmaxpycHFT54Aaxbj11iEifptpKFTEQ@mail.gmail.com>
Permalink /patch/169411/
State New

Comments

Uros Bizjak - July 6, 2012, 10:55 a.m.
Hello!

Sometimes, gcc generates:

leaq    (%rbx,%rax), %rax

that is in fact equivalent (modulo flags reg clobber) to:

addq    %rbx, %rax

The attached patch adds additional peephole2 patterns that convert LEA to
ADD when the second operand of the PLUS RTX matches the output operand.
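
To make the "modulo flags reg clobber" caveat concrete, here is a minimal Python sketch of the two instructions. It is only a model, not GCC code: the function names are made up for illustration, and only two of the flags that ADD writes (CF and ZF) are modelled.

```python
MASK64 = (1 << 64) - 1  # 64-bit registers wrap around modulo 2**64


def lea_base_index(base, index):
    """Model `leaq (base,index), dst`: wrapped sum, flags left untouched."""
    return (base + index) & MASK64


def add_with_flags(src, dst):
    """Model `addq src, dst`: same wrapped sum, but the flags are written too.

    Only CF and ZF are modelled here; the real instruction also sets
    OF, SF, AF and PF.
    """
    full = src + dst
    result = full & MASK64
    flags = {"CF": full > MASK64, "ZF": result == 0}
    return result, flags


if __name__ == "__main__":
    rbx, rax = 5, 7
    # Both instructions leave the same value in the destination register.
    print(lea_base_index(rbx, rax) == add_with_flags(rbx, rax)[0])
```

The value written to the destination is identical in both models. The only observable difference is the flags side effect, which is why the replacement is safe only where the flags register is dead.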

2012-07-06  Uros Bizjak  <ubizjak@gmail.com>

	* config/i386/i386.md (simple lea to add peephole): Also transform
	RTXes where second PLUS operand matches output.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.

Andi Kleen - July 6, 2012, 3:24 p.m.
Uros Bizjak <ubizjak@gmail.com> writes:

> Hello!
>
> Sometimes, gcc generates:
>
> leaq    (%rbx,%rax), %rax
>
> that is in fact equivalent (modulo flags reg clobber) to:
>
> addq    %rbx, %rax
>
> Attached patch adds additional peephole2 patterns that convert LEA to
> ADD when second operand of PLUS RTX matches output operand.

Are you sure this is a win? 

In the past on some CPUs this was done intentionally because the AGU 
had more execution resources than the ALU.

-Andi

Uros Bizjak - July 6, 2012, 9:25 p.m.
On Fri, Jul 6, 2012 at 5:24 PM, Andi Kleen <andi@firstfloor.org> wrote:
> Uros Bizjak <ubizjak@gmail.com> writes:

>> Sometimes, gcc generates:
>>
>> leaq    (%rbx,%rax), %rax
>>
>> that is in fact equivalent (modulo flags reg clobber) to:
>>
>> addq    %rbx, %rax
>>
>> Attached patch adds additional peephole2 patterns that convert LEA to
>> ADD when second operand of PLUS RTX matches output operand.
>
> Are you sure this is a win?
>
> In the past on some CPUs this was done intentionally because the AGU
> had more execution resources than the ALU.

This is just a case of commutative operand handling, following the existing
approach. Please note that there is a separate pass that converts ADDs
to LEAs when appropriate.
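
The commutative handling can be sketched as a toy Python model of the decision the peephole makes. This is an illustration under assumptions, not the GCC implementation: the `Lea` and `Add` types and the `lea_to_add` function are hypothetical names, registers are plain strings, and `flags_dead` stands in for the `peep2_regno_dead_p (0, FLAGS_REG)` condition.

```python
from typing import NamedTuple, Union


class Lea(NamedTuple):
    """dst = op1 + op2, flags untouched (hypothetical model type)."""
    dst: str
    op1: str
    op2: str


class Add(NamedTuple):
    """dst = dst + src, flags clobbered (hypothetical model type)."""
    src: str
    dst: str


def lea_to_add(insn: Lea, flags_dead: bool) -> Union[Lea, Add]:
    """Rewrite a two-operand LEA as ADD when that is safe in this model."""
    if not flags_dead:
        return insn  # ADD would clobber flags that are still live
    if insn.dst == insn.op1:
        # First PLUS operand matches the output (the previously handled form).
        return Add(src=insn.op2, dst=insn.dst)
    if insn.dst == insn.op2:
        # Second PLUS operand matches the output (the case this patch adds).
        return Add(src=insn.op1, dst=insn.dst)
    return insn  # three distinct operands: keep the LEA


if __name__ == "__main__":
    # leaq (%rbx,%rax), %rax  ->  addq %rbx, %rax
    print(lea_to_add(Lea(dst="rax", op1="rbx", op2="rax"), flags_dead=True))
```

Because PLUS is commutative, both matching branches yield the same ADD. The LEA is kept whenever the flags are live or the destination matches neither source operand.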

Uros.

Patch

Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 189310)
+++ config/i386/i386.md	(working copy)
@@ -17301,6 +17301,14 @@ 
 	      (clobber (reg:CC FLAGS_REG))])])
 
 (define_peephole2
+  [(set (match_operand:SWI48 0 "register_operand")
+  	(plus:SWI48 (match_operand:SWI48 1 "<nonmemory_operand>")
+		    (match_dup 0)))]
+  "peep2_regno_dead_p (0, FLAGS_REG)"
+  [(parallel [(set (match_dup 0) (plus:SWI48 (match_dup 0) (match_dup 1)))
+	      (clobber (reg:CC FLAGS_REG))])])
+
+(define_peephole2
   [(set (match_operand:SI 0 "register_operand")
   	(subreg:SI (plus:DI (match_operand:DI 1 "register_operand")
 			    (match_operand:DI 2 "nonmemory_operand")) 0))]
@@ -17312,6 +17320,17 @@ 
   "operands[2] = gen_lowpart (SImode, operands[2]);")
 
 (define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+  	(subreg:SI (plus:DI (match_operand:DI 1 "nonmemory_operand")
+			    (match_operand:DI 2 "register_operand")) 0))]
+  "TARGET_64BIT
+   && peep2_regno_dead_p (0, FLAGS_REG)
+   && REGNO (operands[0]) == REGNO (operands[2])"
+  [(parallel [(set (match_dup 0) (plus:SI (match_dup 0) (match_dup 1)))
+	      (clobber (reg:CC FLAGS_REG))])]
+  "operands[1] = gen_lowpart (SImode, operands[1]);")
+
+(define_peephole2
   [(set (match_operand:SWI48 0 "register_operand")
   	(mult:SWI48 (match_dup 0)
 		    (match_operand:SWI48 1 "const_int_operand")))]