diff mbox

Fix ICE with xmm{16-31} in *truncdfsf_fast_mixed with -mtune=barcelona (PR target/70086)

Message ID 20160305063954.GO3017@tucnak.redhat.com
State New
Headers show

Commit Message

Jakub Jelinek March 5, 2016, 6:39 a.m. UTC
Hi!

The r222470 commit changed =x into =v constraint in *truncdfsf_fast_mixed.
The problem is that for some tunings we have a splitter
/* For converting DF(xmm2) to SF(xmm1), use the following code instead of
   cvtsd2ss:
      unpcklpd xmm2,xmm2   ; packed conversion might crash on signaling NaNs
      cvtpd2ps xmm2,xmm1
If the input operand is memory, it attempts to emit sse2_loadlpd
instruction.  But, that define_insn doesn't have any v constraints and so we
fail to recognize it.  For the vmovsd 2 operand m -> v instruction
*vec_concatv2df implements that too.
So I see 3 options for this:
1) as the patch does, emit *vec_concatv2df manually
2) rename *vec_concatv2df to vec_concatv2df and use gen_vec_concatv2df
   in the splitter; possibly use it instead of sse2_loadlpd there, because
   that insn has uglier/more complex pattern
3) tweak sse2_loadlpd - add various v alternatives to it, guard them with
   avx512vl isa, etc.

I bet the 3) treatment is desirable and likely many other instructions need
it, but that doesn't sound like stage4 material to me, I find it quite
risky, do you agree?  If yes, the following patch can work temporarily
(bootstrapped/regtested on x86_64-linux and i686-linux), or I can do 2),
but in that case I'd like to know your preferences about the suboption
(whether to replace gen_sse2_loadlpd with gen_vec_concatv2df or whether
to use it only for the EXT_REX_SSE_REG_P regs).

2016-03-04  Jakub Jelinek  <jakub@redhat.com>

	PR target/70086
	* config/i386/i386.md (truncdfsf2 splitter): Handle
	EXT_REX_SSE_REG_P destination with memory input.

	* gcc.target/i386/pr70086-1.c: New test.
	* gcc.target/i386/pr70086-2.c: New test.


	Jakub
diff mbox

Patch

--- gcc/config/i386/i386.md.jj	2016-03-02 14:09:50.000000000 +0100
+++ gcc/config/i386/i386.md	2016-03-04 22:56:32.206840674 +0100
@@ -4392,6 +4392,11 @@  (define_split
 	operands[4] = simplify_gen_subreg (V2DFmode, operands[1], DFmode, 0);
       emit_insn (gen_vec_dupv2df (operands[4], operands[1]));
     }
+  else if (EXT_REX_SSE_REG_P (operands[4]))
+    /* Emit *vec_concatv2df.  */
+    emit_insn (gen_rtx_SET (operands[4],
+			    gen_rtx_VEC_CONCAT (V2DFmode, operands[1],
+						CONST0_RTX (DFmode))));
   else
     emit_insn (gen_sse2_loadlpd (operands[4],
 				 CONST0_RTX (V2DFmode), operands[1]));
--- gcc/testsuite/gcc.target/i386/pr70086-1.c.jj	2016-03-04 23:01:07.447081169 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-1.c	2016-03-04 23:00:27.000000000 +0100
@@ -0,0 +1,11 @@ 
+/* PR target/70086 */
+/* { dg-do compile } */
+/* { dg-options "-mtune=barcelona -mavx512vl -ffloat-store" } */
+
+float
+foo (float a, float b, double c, float d, double e, float f)
+{
+  e -= d;
+  d *= e;
+  return e + d;
+}
--- gcc/testsuite/gcc.target/i386/pr70086-2.c.jj	2016-03-04 23:01:07.447081169 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-2.c	2016-03-04 23:00:27.000000000 +0100
@@ -0,0 +1,12 @@ 
+/* PR target/70086 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mtune=barcelona -mavx512vl" } */
+
+float
+foo (double *p)
+{
+  register float xmm16 __asm ("xmm16");
+  xmm16 = *p;
+  asm volatile ("" : "+v" (xmm16));
+  return xmm16;
+}