Message ID | CAFULd4anTGXCiZe4Vxp1qmmqA6c8eEdDA=xGGUT=ah15SLgG+w@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Mon, Jul 4, 2011 at 3:18 AM, Uros Bizjak <ubizjak@gmail.com> wrote: > On Mon, Jul 4, 2011 at 7:13 AM, H.J. Lu <hjl.tools@gmail.com> wrote: > >>>>> In one SSE2 int->float split, when TARGET_USE_VECTOR_CONVERTS is true, >>>>> TARGET_INTER_UNIT_MOVES is false and GENERAL_REG_P (op1) is true. we >>>>> will get gcc_unreachable. This patch removes TARGET_INTER_UNIT_MOVES >>>>> check. OK for trunk? >>>> >>>> This will result in register allocation failure. Operand 0 of >> >> That particular sse2_loadld insn matches: >> >> (insn 49 22 50 5 (set (reg:V4SI 21 xmm0 [83]) >> (vec_merge:V4SI (vec_duplicate:V4SI (reg/v:SI 1 dx [orig:64 >> test ] [64])) >> (const_vector:V4SI [ >> (const_int 0 [0]) >> (const_int 0 [0]) >> (const_int 0 [0]) >> (const_int 0 [0]) >> ]) >> (const_int 1 [0x1]))) x.i:11 1365 {vec_setv4si_0} >> (nil)) >> > > Yes, but it should not be generated for !TARGET_INTER_UNIT_MOVES. The > constraint should be Yi, but then we don't shadow other alternatives > correctly. > >>>> sse2_loadld pattern has conditional constraint Yi that depends on >>>> TARGET_INTER_UNIT_MOVES, so we can't blindly generate sse2_loadld >>>> after reload. I'm testing attached patch. >>>> >>>> BTW: Do you perhaps have a testcase for this problem? >>> >>> I have a testcase. But it needs a new x86 optimization we are working on it. >>> >>>> 2011-07-03 Uros Bizjak <ubizjak@gmail.com> >>>> >>>> PR target/49600 >>>> * config/i386/i386.md (SSE2 int->float split): Push operand 1 in >>>> general register to memory for !TARGET_INTER_UNIT_MOVES. >>>> >>> >>> I will give it a try. >>> >> >> It doesn't work: I still got > > Yes, I later noticed that I have changed the wrong pattern (the one > with memory clobber) ;( . Attached is the correct patch. > This works. Can you check it in? Thanks.
Index: config/i386/i386.md =================================================================== --- config/i386/i386.md (revision 175786) +++ config/i386/i386.md (working copy) @@ -5022,11 +5022,20 @@ if (GET_CODE (op1) == SUBREG) op1 = SUBREG_REG (op1); - if (GENERAL_REG_P (op1) && TARGET_INTER_UNIT_MOVES) + if (GENERAL_REG_P (op1)) { operands[4] = simplify_gen_subreg (V4SImode, operands[0], <MODE>mode, 0); - emit_insn (gen_sse2_loadld (operands[4], - CONST0_RTX (V4SImode), operands[1])); + if (TARGET_INTER_UNIT_MOVES) + emit_insn (gen_sse2_loadld (operands[4], + CONST0_RTX (V4SImode), operands[1])); + else + { + operands[5] = ix86_force_to_memory (GET_MODE (operands[1]), + operands[1]); + emit_insn (gen_sse2_loadld (operands[4], + CONST0_RTX (V4SImode), operands[5])); + ix86_free_from_memory (GET_MODE (operands[1])); + } } /* We can ignore possible trapping value in the high part of SSE register for non-trapping math. */