i386: add a variant peephole for decl (mem)

Message ID 56C78F87.8060705@redhat.com
State New

Commit Message

Bernd Schmidt Feb. 19, 2016, 9:56 p.m. UTC
PR 49095 requested the following optimization:

-       movl    -120(%rax), %ecx
-       leal    -1(%rcx), %edx
-       movl    %edx, -120(%rax)
-       testl   %edx, %edx
+       subl    $1, -120(%rax)
         jne     .L92

The PR was fixed by adding a peephole, but it doesn't actually trigger
for the code sequence quoted above. This is because the pattern expects
to see a parallel including a clobber of CC, which is what you'd get for
a normal add or logical operation. A lea does not match: it has no CC
clobber, and its input and output registers can also be different.
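
For illustration only, here is a minimal C sketch of the kind of source that can lead to a load / decrement / store / test sequence like the one quoted above. It is not the testcase from PR 49095; the struct, field and function names are hypothetical, and whether the compiler picks lea or a flag-setting sub for the decrement depends on register allocation and tuning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record; the name and layout are illustrative only.  */
struct item
{
  int refcount;
};

/* Decrement a counter that lives in memory and report whether it is
   still non-zero.  Neither the old nor the new value is needed in a
   register afterwards, so the load, decrement, store and test could
   in principle be folded into one read-modify-write instruction
   whose flags feed the conditional jump.  */
int
release (struct item *p)
{
  assert (p != NULL);
  p->refcount = p->refcount - 1;
  return p->refcount != 0;
}
```

If the decrement is emitted as lea, the insn has no CC clobber and may write a register other than the one that was loaded, which is the shape the existing peephole does not cover.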

This shows up with some IRA cost changes I'm testing for a different PR. 
The following patch adds a variant peephole. It would be a prerequisite 
for those IRA changes so as to not regress an existing testcase. The new 
peephole triggers a few times in my collection of .i files.

Bootstrapped and tested on x86_64-linux.  Ok?



Bernd

Patch

	* config/i386/i386.md (operation on memory peephole): Duplicate an
	existing peephole and adapt it to match lea rather than an operation
	that clobbers CC.

Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md	(revision 233451)
+++ gcc/config/i386/i386.md	(working copy)
@@ -17952,6 +17952,38 @@  (define_peephole2
 				 operands[5], const0_rtx);
 })
 
+;; Likewise for instances where we have a lea pattern.
+(define_peephole2
+  [(set (match_operand:SWI 0 "register_operand")
+	(match_operand:SWI 1 "memory_operand"))
+   (set (match_operand:SWI 3 "register_operand")
+	(plus (match_dup 0)
+	      (match_operand:SWI 2 "<nonmemory_operand>")))
+   (set (match_dup 1) (match_dup 3))
+   (set (reg FLAGS_REG) (compare (match_dup 3) (const_int 0)))]
+  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
+   && peep2_reg_dead_p (4, operands[3])
+   && (rtx_equal_p (operands[0], operands[3])
+       || peep2_reg_dead_p (2, operands[0]))
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[3], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && (<MODE>mode != QImode
+       || immediate_operand (operands[2], QImode)
+       || any_QIreg_operand (operands[2], QImode))
+   && ix86_match_ccmode (peep2_next_insn (3), CCGOCmode)"
+  [(parallel [(set (match_dup 4) (match_dup 5))
+	      (set (match_dup 1) (plus:SWI (match_dup 1)
+					   (match_dup 2)))])]
+{
+  operands[4] = SET_DEST (PATTERN (peep2_next_insn (3)));
+  operands[5] = gen_rtx_PLUS (<MODE>mode,
+			      copy_rtx (operands[1]),
+			      copy_rtx (operands[2]));
+  operands[5] = gen_rtx_COMPARE (GET_MODE (operands[4]),
+				 operands[5], const0_rtx);
+})
+
 (define_peephole2
   [(parallel [(set (match_operand:SWI 0 "register_operand")
 		   (match_operator:SWI 2 "plusminuslogic_operator"