From patchwork Fri Aug 20 18:33:59 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: Optimize nested SIGN_EXTENDs/ZERO_EXTENDs (PR target/45336)
Date: Fri, 20 Aug 2010 08:33:59 -0000
From: Jakub Jelinek
X-Patchwork-Id: 62301
Message-Id: <20100820183359.GH702@tyan-ft48-01.lab.bos.redhat.com>
To: Paolo Bonzini, "H.J. Lu"
Cc: Bernd Schmidt, gcc-patches@gcc.gnu.org

On Fri, Aug 20, 2010 at 08:00:33PM +0200, Paolo Bonzini wrote:
> On 08/20/2010 07:27 PM, Jakub Jelinek wrote:
> >Not sure what exactly is
> >pextrb ..., %ecx
> >insn doing to the upper 32 bits of %rcx, if it clears them
>
> Probably yes like every other 32-bit writeback on x86_64.

The manuals confirm that.  The following seems to work just fine in the
quick testing I've done so far:

2010-08-20  Jakub Jelinek

	* config/i386/sse.md (*sse4_1_pextrb): Add SWI48 mode iterator
	to cover zero extension into 64-bit register.
	(*sse2_pextrw): Likewise.
	(*sse4_1_pextrd_zext): New insn.

	Jakub

--- gcc/config/i386/sse.md.jj	2010-08-11 21:08:03.000000000 +0200
+++ gcc/config/i386/sse.md	2010-08-20 20:24:08.000000000 +0200
@@ -7075,14 +7075,14 @@ (define_insn "*sse4_1_pinsrq"
    (set_attr "length_immediate" "1")
    (set_attr "mode" "TI")])
 
-(define_insn "*sse4_1_pextrb"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-	(zero_extend:SI
+(define_insn "*sse4_1_pextrb_<mode>"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+	(zero_extend:SWI48
 	  (vec_select:QI
 	    (match_operand:V16QI 1 "register_operand" "x")
 	    (parallel [(match_operand:SI 2 "const_0_to_15_operand" "n")]))))]
   "TARGET_SSE4_1"
-  "%vpextrb\t{%2, %1, %0|%0, %1, %2}"
+  "%vpextrb\t{%2, %1, %k0|%k0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
@@ -7102,14 +7102,14 @@ (define_insn "*sse4_1_pextrb_memory"
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "TI")])
 
-(define_insn "*sse2_pextrw"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-	(zero_extend:SI
+(define_insn "*sse2_pextrw_<mode>"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+	(zero_extend:SWI48
 	  (vec_select:HI
 	    (match_operand:V8HI 1 "register_operand" "x")
 	    (parallel [(match_operand:SI 2 "const_0_to_7_operand" "n")]))))]
   "TARGET_SSE2"
-  "%vpextrw\t{%2, %1, %0|%0, %1, %2}"
+  "%vpextrw\t{%2, %1, %k0|%k0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix_data16" "1")
    (set_attr "length_immediate" "1")
@@ -7142,6 +7142,20 @@ (define_insn "*sse4_1_pextrd"
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "TI")])
 
+(define_insn "*sse4_1_pextrd_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(zero_extend:DI
+	  (vec_select:SI
+	    (match_operand:V4SI 1 "register_operand" "x")
+	    (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n")]))))]
+  "TARGET_64BIT && TARGET_SSE4_1"
+  "%vpextrd\t{%2, %1, %k0|%k0, %1, %2}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "TI")])
+
 ;; It must come before *vec_extractv2di_1_sse since it is preferred.
 (define_insn "*sse4_1_pextrq"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=rm")
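
For illustration only, and not part of the patch: a small portable C
sketch of why the nested extension is redundant.  It models the byte
extraction with a plain array lookup instead of the real pextrb
instruction, and all function names here are hypothetical.  Zero-extending
the byte to 32 bits and then to 64 bits gives the same value as
zero-extending it to 64 bits directly, which is roughly the equivalence
the SWI48 variants of the patterns rely on when the 32-bit register write
already clears the upper half.

```c
#include <stdint.h>

/* Hypothetical stand-in for the lane extraction: return byte `idx`
   (0..15) of a 16-byte vector.  This only models the value pextrb
   produces; it does not use the instruction itself. */
uint8_t extract_byte(const uint8_t vec[16], unsigned idx)
{
    return vec[idx & 15];
}

/* Shape of the old code: zero-extend the byte to 32 bits (QI -> SI),
   then apply a second, separate zero extension to 64 bits (SI -> DI). */
uint64_t extract_nested(const uint8_t vec[16], unsigned idx)
{
    uint32_t si = (uint32_t)extract_byte(vec, idx);
    return (uint64_t)si;
}

/* Shape the new patterns can match: one zero extension straight from
   the byte to 64 bits (QI -> DI). */
uint64_t extract_direct(const uint8_t vec[16], unsigned idx)
{
    return (uint64_t)extract_byte(vec, idx);
}
```

Both functions return the same value for every lane, so a compiler that
knows the 32-bit destination write zero-extends can drop the second step.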