| Submitter | Uros Bizjak |
|---|---|
| Date | March 12, 2012, 8:52 a.m. |
| Message ID | <CAFULd4Ztr=OOOc8DknWzY0FqXDW_OZKKkuSsO8oO0z2vBayEPA@mail.gmail.com> |
| Download | mbox | patch |
| Permalink | /patch/146049/ |
| State | New |
| Headers | show |
Comments
Il 12/03/2012 09:52, Uros Bizjak ha scritto: > +(define_peephole2 > + [(parallel [(set (reg FLAGS_REG) (match_operand 0 "" "")) > + (match_operand 4 "" "")]) > + (set (match_operand:QI 1 "register_operand" "") > + (match_operator:QI 2 "ix86_comparison_operator" > + [(reg FLAGS_REG) (const_int 0)])) > + (set (match_operand 3 "q_regs_operand" "") > + (zero_extend (match_dup 1)))] > + "(peep2_reg_dead_p (3, operands[1]) > + || operands_match_p (operands[1], operands[3])) > + && ! reg_overlap_mentioned_p (operands[3], operands[0])" I understand that you're assuming the shape of operands[4] to be the same as operands[3], but would it be preferrable to add another overlap check on operands[4]? For example the transformation is invalid if you had an overlap between operands[3] and the destination of operands[4]. Paolo
On Mon, Mar 12, 2012 at 11:13 AM, Paolo Bonzini <bonzini@gnu.org> wrote: > Il 12/03/2012 09:52, Uros Bizjak ha scritto: >> +(define_peephole2 >> + [(parallel [(set (reg FLAGS_REG) (match_operand 0 "" "")) >> + (match_operand 4 "" "")]) >> + (set (match_operand:QI 1 "register_operand" "") >> + (match_operator:QI 2 "ix86_comparison_operator" >> + [(reg FLAGS_REG) (const_int 0)])) >> + (set (match_operand 3 "q_regs_operand" "") >> + (zero_extend (match_dup 1)))] >> + "(peep2_reg_dead_p (3, operands[1]) >> + || operands_match_p (operands[1], operands[3])) >> + && ! reg_overlap_mentioned_p (operands[3], operands[0])" > > I understand that you're assuming the shape of operands[4] to be the > same as operands[3], but would it be preferrable to add another overlap > check on operands[4]? > > For example the transformation is invalid if you had an overlap between > operands[3] and the destination of operands[4]. The destination of operands[4] _always_ matches one of operands inside operand[0]. All arithmetic insn that set flags are destructive on x86. Uros.
Patch
Index: i386.md =================================================================== --- i386.md (revision 185201) +++ i386.md (working copy) @@ -11170,6 +11170,27 @@ ix86_expand_clear (operands[3]); }) +(define_peephole2 + [(parallel [(set (reg FLAGS_REG) (match_operand 0 "" "")) + (match_operand 4 "" "")]) + (set (match_operand:QI 1 "register_operand" "") + (match_operator:QI 2 "ix86_comparison_operator" + [(reg FLAGS_REG) (const_int 0)])) + (set (match_operand 3 "q_regs_operand" "") + (zero_extend (match_dup 1)))] + "(peep2_reg_dead_p (3, operands[1]) + || operands_match_p (operands[1], operands[3])) + && ! reg_overlap_mentioned_p (operands[3], operands[0])" + [(parallel [(set (match_dup 5) (match_dup 0)) + (match_dup 4)]) + (set (strict_low_part (match_dup 6)) + (match_dup 2))] +{ + operands[5] = gen_rtx_REG (GET_MODE (operands[0]), FLAGS_REG); + operands[6] = gen_lowpart (QImode, operands[3]); + ix86_expand_clear (operands[3]); +}) + ;; Similar, but match zero extend with andsi3. (define_peephole2 @@ -11190,6 +11211,28 @@ operands[5] = gen_lowpart (QImode, operands[3]); ix86_expand_clear (operands[3]); }) + +(define_peephole2 + [(parallel [(set (reg FLAGS_REG) (match_operand 0 "" "")) + (match_operand 4 "" "")]) + (set (match_operand:QI 1 "register_operand" "") + (match_operator:QI 2 "ix86_comparison_operator" + [(reg FLAGS_REG) (const_int 0)])) + (parallel [(set (match_operand 3 "q_regs_operand" "") + (zero_extend (match_dup 1))) + (clobber (reg:CC FLAGS_REG))])] + "(peep2_reg_dead_p (3, operands[1]) + || operands_match_p (operands[1], operands[3])) + && ! reg_overlap_mentioned_p (operands[3], operands[0])" + [(parallel [(set (match_dup 5) (match_dup 0)) + (match_dup 4)]) + (set (strict_low_part (match_dup 6)) + (match_dup 2))] +{ + operands[5] = gen_rtx_REG (GET_MODE (operands[0]), FLAGS_REG); + operands[6] = gen_lowpart (QImode, operands[3]); + ix86_expand_clear (operands[3]); +}) ;; Call instructions.
Hello! Attached patch improves setCC + movzbl to xor + setcc peephole2 to also handle CC setting arithmetic instructions. 2012-03-12 Uros Bizjak <ubizjak@gmail.com> * config/i386/i386.md (setcc + movzbl to xor + setcc peephole2): Also convert sequences with CC setting arithmetic instruction. Tested on x86_64-pc-linux-gnu {, -m32}, committed to mainline SVN. Uros.