Message ID | AANLkTikKQrpf_o4rkyiH5NXnUsHC-E-0eiQftXcItR3s@mail.gmail.com |
---|---|
State | New |
Headers | show |
On 02/07/10 23:06, Carrot Wei wrote: > Hi Richard > > The following patch has been tested on arm qemu. Could you take a look? > > thanks > Guozhi > > ChangeLog: > 2010-07-02 Wei Guozhi<carrot@google.com> > > * thumb2.md (peephole2 to convert zero_extract/compare of lowest bits > to lshift/compare): New. > This is basically fine. But one minor nit. A lsl ...,#0 in Thumb1 isn't a left shift at all, but a movs (and a picky assembler might reject the construct, as some versions of the ARM ARM state that the range of the shift should be in the range 1...31). Fortunately, I doubt this pattern would ever be generated for that case, since a zero_extract of all 32-bits of an SImode value would simplify into the original operand. So please change the limit on op2 to be less than 32. OK with that change. R. > > Index: thumb2.md > =================================================================== > --- thumb2.md (revision 161725) > +++ thumb2.md (working copy) > @@ -1501,3 +1501,29 @@ > VOIDmode, operands[0], const0_rtx); > ") > > +(define_peephole2 > + [(set (match_operand:CC_NOOV 0 "cc_register" "") > + (compare:CC_NOOV (zero_extract:SI > + (match_operand:SI 1 "low_register_operand" "") > + (match_operand:SI 2 "const_int_operand" "") > + (const_int 0)) > + (const_int 0))) > + (match_scratch:SI 3 "l") > + (set (pc) > + (if_then_else (match_operator:CC_NOOV 4 "equality_operator" > + [(match_dup 0) (const_int 0)]) > + (match_operand 5 "" "") > + (match_operand 6 "" "")))] > + "TARGET_THUMB2 > +&& (INTVAL (operands[2])> 0&& INTVAL (operands[2])<= 32)" > + [(parallel [(set (match_dup 0) > + (compare:CC_NOOV (ashift:SI (match_dup 1) (match_dup 2)) > + (const_int 0))) > + (clobber (match_dup 3))]) > + (set (pc) > + (if_then_else (match_op_dup 4 [(match_dup 0) (const_int 0)]) > + (match_dup 5) (match_dup 6)))] > + " > + operands[2] = GEN_INT (32 - INTVAL (operands[2])); > + ") > + > > On Fri, Jul 2, 2010 at 5:15 PM, Richard Earnshaw<rearnsha@arm.com> wrote: >> >> On Fri, 2010-07-02 at 08:53 +0800, Carrot Wei wrote: >>> Hi Richard >>> >>> The new peephole2 and the old pattern does different optimization. As >>> you have described the peephole2 can optimize the cases that test a >>> single bit in a word. But the old pattern tests if the bit fields at >>> the low end of a word is equal or not equal to zero, the bit field may >>> contain more than 1 bit. Interestingly the test case with the old >>> pattern can fit in both situations. If we change the test case as >>> following, it can show the regression. >>> >>> struct A >>> { >>> int v:2; >>> }; >>> >>> >>> int bar(); >>> int foo(struct A* p) >>> { >>> if (p->v) >>> return 1; >>> return bar(); >>> } >>> >>> So we need another peephole2 to bring that optimization back. >>> >>> thanks >>> Guozhi >> >> Yes, a peep2 for that should be pretty straight-forward to generate. >> Simply transform the code into a left-shift and compare with 0, then a >> branch if eq/ne. >> >> R. >> >> > >
Index: thumb2.md =================================================================== --- thumb2.md (revision 161725) +++ thumb2.md (working copy) @@ -1501,3 +1501,29 @@ VOIDmode, operands[0], const0_rtx); ") +(define_peephole2 + [(set (match_operand:CC_NOOV 0 "cc_register" "") + (compare:CC_NOOV (zero_extract:SI + (match_operand:SI 1 "low_register_operand" "") + (match_operand:SI 2 "const_int_operand" "") + (const_int 0)) + (const_int 0))) + (match_scratch:SI 3 "l") + (set (pc) + (if_then_else (match_operator:CC_NOOV 4 "equality_operator" + [(match_dup 0) (const_int 0)]) + (match_operand 5 "" "") + (match_operand 6 "" "")))] + "TARGET_THUMB2 + && (INTVAL (operands[2]) > 0 && INTVAL (operands[2]) <= 32)" + [(parallel [(set (match_dup 0) + (compare:CC_NOOV (ashift:SI (match_dup 1) (match_dup 2)) + (const_int 0))) + (clobber (match_dup 3))]) + (set (pc) + (if_then_else (match_op_dup 4 [(match_dup 0) (const_int 0)]) + (match_dup 5) (match_dup 6)))] + " + operands[2] = GEN_INT (32 - INTVAL (operands[2])); + ") +