Message ID | CACysShgA6gGiW7cJg4ZPHkX_mR1eeqe5AkKhRPh=sc4VKvLAiQ@mail.gmail.com |
---|---|
State | New |
On Mon, May 21, 2012 at 5:39 PM, Alexander Ivchenko <aivchenk@gmail.com> wrote:

> Changelog entry:
>
> 2012-05-21  Alexander Ivchenko  <alexander.ivchenko@intel.com>
>
>     PR target/53435
>     * config/i386/i386.c (ix86_expand_vec_perm): Use correct op.
>     (ix86_expand_vec_perm): Use int mode instead of float.
>     (expand_vec_perm_pshufb): Remove handling of useless type
>     conversion.
>
> Patch attached. Bootstrap passes.
>
> OK for trunk?

OK for trunk and 4.7, under the assumption that the relevant tests from the testsuite no longer fail on an AVX2 target.

Thanks,
Uros.
On Mon, May 21, 2012 at 8:51 AM, Uros Bizjak <ubizjak@gmail.com> wrote:

> On Mon, May 21, 2012 at 5:39 PM, Alexander Ivchenko <aivchenk@gmail.com> wrote:
>
>> Changelog entry:
>>
>> 2012-05-21  Alexander Ivchenko  <alexander.ivchenko@intel.com>
>>
>>     PR target/53435
>>     * config/i386/i386.c (ix86_expand_vec_perm): Use correct op.
>>     (ix86_expand_vec_perm): Use int mode instead of float.
>>     (expand_vec_perm_pshufb): Remove handling of useless type
>>     conversion.
>>
>> Patch attached. Bootstrap passes.
>>
>> OK for trunk?
>
> OK for trunk and 4.7, under assumption that the relevant tests from
> the testsuite don't fail anymore on AVX2 target.
>

We noticed the issue when GCC is configured with --with-arch=native. However, current GCC trunk does not bootstrap on an AVX target when configured with --with-arch=native:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53373

We will test it on 4.7.

Thanks.
diff --git gcc/config/i386/i386.c gcc/config/i386/i386.c
index f09b2bb..b4ad6ed 100644
--- gcc/config/i386/i386.c
+++ gcc/config/i386/i386.c
@@ -19956,7 +19956,7 @@ ix86_expand_vec_perm (rtx operands[])
 	  t1 = gen_reg_rtx (V8SImode);
 	  t2 = gen_reg_rtx (V8SImode);
 	  emit_insn (gen_avx2_permvarv8si (t1, op0, mask));
-	  emit_insn (gen_avx2_permvarv8si (t2, op0, mask));
+	  emit_insn (gen_avx2_permvarv8si (t2, op1, mask));
 	  goto merge_two;
 	}
       return;
@@ -19989,10 +19989,10 @@ ix86_expand_vec_perm (rtx operands[])
 
     case V4SFmode:
       t1 = gen_reg_rtx (V8SFmode);
-      t2 = gen_reg_rtx (V8SFmode);
-      mask = gen_lowpart (V4SFmode, mask);
+      t2 = gen_reg_rtx (V8SImode);
+      mask = gen_lowpart (V4SImode, mask);
       emit_insn (gen_avx_vec_concatv8sf (t1, op0, op1));
-      emit_insn (gen_avx_vec_concatv8sf (t2, mask, mask));
+      emit_insn (gen_avx_vec_concatv8si (t2, mask, mask));
       emit_insn (gen_avx2_permvarv8sf (t1, t1, t2));
       emit_insn (gen_avx_vextractf128v8sf (target, t1, const0_rtx));
       return;
@@ -36508,12 +36508,6 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d)
 				gen_rtvec_v (GET_MODE_NUNITS (vmode), rperm));
       vperm = force_reg (vmode, vperm);
 
-      if (vmode == V8SImode && d->vmode == V8SFmode)
-	{
-	  vmode = V8SFmode;
-	  vperm = gen_lowpart (vmode, vperm);
-	}
-
       target = gen_lowpart (vmode, d->target);
       op0 = gen_lowpart (vmode, d->op0);
       if (d->one_operand_p)
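
For readers following the first hunk, here is a rough scalar model in plain C of why permuting op0 twice gives wrong results. It is only a sketch, not GCC code: permvar8, perm2_reference, perm2_merge and self_check are hypothetical names, and the merge on selector bit 3 is an assumption about what the merge_two path does, since that code is not part of the diff above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 8

/* Scalar stand-in for one 8-lane variable permute: only the low three
   bits of each selector are used.  */
static void
permvar8 (int32_t dst[LANES], const int32_t src[LANES],
          const int32_t sel[LANES])
{
  int32_t tmp[LANES];
  for (int i = 0; i < LANES; i++)
    tmp[i] = src[sel[i] & 7];
  memcpy (dst, tmp, sizeof tmp);   /* safe even if dst aliases src */
}

/* Intended two-operand semantics: selectors 0..7 pick from op0,
   selectors 8..15 pick from op1.  */
static void
perm2_reference (int32_t dst[LANES], const int32_t op0[LANES],
                 const int32_t op1[LANES], const int32_t sel[LANES])
{
  for (int i = 0; i < LANES; i++)
    {
      int s = sel[i] & 15;
      dst[i] = s < 8 ? op0[s] : op1[s - 8];
    }
}

/* Permute each source separately, then merge per lane on selector
   bit 3 (assumed model of the merge step).  Passing op0 as SECOND
   mimics the pre-patch code; passing op1 mimics the patched code.  */
static void
perm2_merge (int32_t dst[LANES], const int32_t first[LANES],
             const int32_t second[LANES], const int32_t sel[LANES])
{
  int32_t t1[LANES], t2[LANES];
  permvar8 (t1, first, sel);
  permvar8 (t2, second, sel);
  for (int i = 0; i < LANES; i++)
    dst[i] = (sel[i] & 8) ? t2[i] : t1[i];
}

/* Small self-check: the patched variant matches the reference, the
   pre-patch variant does not.  */
static void
self_check (void)
{
  const int32_t op0[LANES] = {0, 1, 2, 3, 4, 5, 6, 7};
  const int32_t op1[LANES] = {100, 101, 102, 103, 104, 105, 106, 107};
  const int32_t sel[LANES] = {0, 9, 2, 11, 4, 13, 6, 15};
  int32_t want[LANES], fixed[LANES], buggy[LANES];

  perm2_reference (want, op0, op1, sel);
  perm2_merge (fixed, op0, op1, sel);
  perm2_merge (buggy, op0, op0, sel);

  assert (memcmp (want, fixed, sizeof want) == 0);
  assert (memcmp (want, buggy, sizeof want) != 0);
}
```

Calling self_check () from a small main is enough to try it. With both temporaries taken from op0, every lane whose selector is 8 or above silently reads op0 instead of op1, which is the kind of wrong result the one-character change from op0 to op1 addresses.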