Patchwork PATCH: PR target/49600: Bad SSE2 int->float split in i386.md

login
register
mail settings
Submitter Uros Bizjak
Date July 4, 2011, 10:18 a.m.
Message ID <CAFULd4anTGXCiZe4Vxp1qmmqA6c8eEdDA=xGGUT=ah15SLgG+w@mail.gmail.com>
Download mbox | patch
Permalink /patch/103080/
State New
Headers show

Comments

Uros Bizjak - July 4, 2011, 10:18 a.m.
On Mon, Jul 4, 2011 at 7:13 AM, H.J. Lu <hjl.tools@gmail.com> wrote:

>>>> In one SSE2 int->float split, when TARGET_USE_VECTOR_CONVERTS is true,
>>>> TARGET_INTER_UNIT_MOVES is false and GENERAL_REG_P (op1) is true. we
>>>> will get gcc_unreachable.  This patch removes TARGET_INTER_UNIT_MOVES
>>>> check.  OK for trunk?
>>>
>>> This will result in register allocation failure. Operand 0 of
>
> That particular sse2_loadld insn matches:
>
> (insn 49 22 50 5 (set (reg:V4SI 21 xmm0 [83])
>        (vec_merge:V4SI (vec_duplicate:V4SI (reg/v:SI 1 dx [orig:64
> test ] [64]))
>            (const_vector:V4SI [
>                    (const_int 0 [0])
>                    (const_int 0 [0])
>                    (const_int 0 [0])
>                    (const_int 0 [0])
>                ])
>            (const_int 1 [0x1]))) x.i:11 1365 {vec_setv4si_0}
>     (nil))
>

Yes, but it should not be generated for !TARGET_INTER_UNIT_MOVES. The
constraint should be Yi, but then we don't shadow other alternatives
correctly.

>>> sse2_loadld pattern has conditional constraint Yi that depends on
>>> TARGET_INTER_UNIT_MOVES, so we can't blindly generate sse2_loadld
>>> after reload.  I'm testing attached patch.
>>>
>>> BTW: Do you perhaps have a testcase for this problem?
>>
>> I have a testcase. But it needs a new x86 optimization we are working on it.
>>
>>> 2011-07-03  Uros Bizjak  <ubizjak@gmail.com>
>>>
>>>        PR target/49600
>>>        * config/i386/i386.md (SSE2 int->float split): Push operand 1 in
>>>        general register to memory for !TARGET_INTER_UNIT_MOVES.
>>>
>>
>> I will give it a try.
>>
>
> It doesn't work: I still got

Yes, I later noticed that I have changed the wrong pattern (the one
with memory clobber) ;( . Attached is the correct patch.

Uros.
H.J. Lu - July 4, 2011, 1:42 p.m.
On Mon, Jul 4, 2011 at 3:18 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Jul 4, 2011 at 7:13 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>>>>> In one SSE2 int->float split, when TARGET_USE_VECTOR_CONVERTS is true,
>>>>> TARGET_INTER_UNIT_MOVES is false and GENERAL_REG_P (op1) is true. we
>>>>> will get gcc_unreachable.  This patch removes TARGET_INTER_UNIT_MOVES
>>>>> check.  OK for trunk?
>>>>
>>>> This will result in register allocation failure. Operand 0 of
>>
>> That particular sse2_loadld insn matches:
>>
>> (insn 49 22 50 5 (set (reg:V4SI 21 xmm0 [83])
>>        (vec_merge:V4SI (vec_duplicate:V4SI (reg/v:SI 1 dx [orig:64
>> test ] [64]))
>>            (const_vector:V4SI [
>>                    (const_int 0 [0])
>>                    (const_int 0 [0])
>>                    (const_int 0 [0])
>>                    (const_int 0 [0])
>>                ])
>>            (const_int 1 [0x1]))) x.i:11 1365 {vec_setv4si_0}
>>     (nil))
>>
>
> Yes, but it should not be generated for !TARGET_INTER_UNIT_MOVES. The
> constraint should be Yi, but then we don't shadow other alternatives
> correctly.
>
>>>> sse2_loadld pattern has conditional constraint Yi that depends on
>>>> TARGET_INTER_UNIT_MOVES, so we can't blindly generate sse2_loadld
>>>> after reload.  I'm testing attached patch.
>>>>
>>>> BTW: Do you perhaps have a testcase for this problem?
>>>
>>> I have a testcase. But it needs a new x86 optimization we are working on it.
>>>
>>>> 2011-07-03  Uros Bizjak  <ubizjak@gmail.com>
>>>>
>>>>        PR target/49600
>>>>        * config/i386/i386.md (SSE2 int->float split): Push operand 1 in
>>>>        general register to memory for !TARGET_INTER_UNIT_MOVES.
>>>>
>>>
>>> I will give it a try.
>>>
>>
>> It doesn't work: I still got
>
> Yes, I later noticed that I have changed the wrong pattern (the one
> with memory clobber) ;( . Attached is the correct patch.
>

This works.  Can you check it in?

Thanks.

Patch

Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 175786)
+++ config/i386/i386.md	(working copy)
@@ -5022,11 +5022,20 @@ 
   if (GET_CODE (op1) == SUBREG)
     op1 = SUBREG_REG (op1);
 
-  if (GENERAL_REG_P (op1) && TARGET_INTER_UNIT_MOVES)
+  if (GENERAL_REG_P (op1))
     {
       operands[4] = simplify_gen_subreg (V4SImode, operands[0], <MODE>mode, 0);
-      emit_insn (gen_sse2_loadld (operands[4],
-				  CONST0_RTX (V4SImode), operands[1]));
+      if (TARGET_INTER_UNIT_MOVES)
+	emit_insn (gen_sse2_loadld (operands[4],
+				    CONST0_RTX (V4SImode), operands[1]));
+      else
+	{
+	  operands[5] = ix86_force_to_memory (GET_MODE (operands[1]),
+					      operands[1]);
+	  emit_insn (gen_sse2_loadld (operands[4],
+				      CONST0_RTX (V4SImode), operands[5]));
+	  ix86_free_from_memory (GET_MODE (operands[1]));
+	}
     }
   /* We can ignore possible trapping value in the
      high part of SSE register for non-trapping math. */