Patchwork [i386] : PR target/43653: [4.3/4.4/4.5/4.6 Regression] ICE at reload1.c:1188 with -O1 -ftree-vectorize

Submitter Uros Bizjak
Date Feb. 16, 2011, 10:39 a.m.
Message ID <AANLkTi=bdR-U6PQQsPhjtchs15pYGsRW4XOFPABJJ8H5@mail.gmail.com>
Permalink /patch/83353/
State New

Comments

Uros Bizjak - Feb. 16, 2011, 10:39 a.m.
Hello!

The attached patch fixes PR target/43653. The core of the problem is
that the frame pointer register leaks into an SSE vector instruction.
Eventually, the frame pointer gets eliminated to SP+offset, and reload
chokes on the reload from SP+offset to an SSE register.

The solution (as proposed by Jeff) is to help reload by returning
GENERAL_REGS as the secondary reload class when a PLUS RTX is to be
reloaded into an SSE register.
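
As a rough illustration of that decision, here is a minimal, standalone
sketch in C. It is a toy model only: the toy_* enums and functions are
hypothetical stand-ins invented for this example, not GCC's real
reg_class, rtx or machine_mode types, and the real check lives in
ix86_secondary_reload in the patch below.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC internals, invented for this sketch.  */
enum toy_reg_class { TOY_NO_REGS, TOY_GENERAL_REGS, TOY_SSE_REGS };
enum toy_rtx_code { TOY_REG, TOY_MEM, TOY_PLUS };
enum toy_mode { TOY_DImode, TOY_V2DImode };

/* In this toy model only DImode counts as a scalar integer mode.  */
static bool
toy_scalar_int_mode_p (enum toy_mode mode)
{
  return mode == TOY_DImode;
}

/* Return the register class needed for an intermediate register, or
   TOY_NO_REGS when no intermediate register is required.  Mirrors the
   shape of the condition added by the patch: an input reload of a
   scalar integer PLUS into an SSE register goes through a general
   register first.  */
static enum toy_reg_class
toy_secondary_reload (bool in_p, enum toy_rtx_code code,
                      enum toy_reg_class rclass, enum toy_mode mode)
{
  if (in_p && code == TOY_PLUS
      && rclass == TOY_SSE_REGS
      && toy_scalar_int_mode_p (mode))
    return TOY_GENERAL_REGS;

  return TOY_NO_REGS;
}
```

In this model, toy_secondary_reload (true, TOY_PLUS, TOY_SSE_REGS,
TOY_DImode) yields TOY_GENERAL_REGS, while an output reload or a
non-PLUS operand falls through to TOY_NO_REGS.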

2011-02-16  Uros Bizjak  <ubizjak@gmail.com>

	* config/i386/i386.c (ix86_secondary_reload): Handle SSE
	input reload with PLUS RTX.

testsuite/ChangeLog:

2011-02-16  Uros Bizjak  <ubizjak@gmail.com>

	* gcc.target/i386/pr43653.c: New test.

The patch was tested on x86_64-pc-linux-gnu {,-m32} and will be
committed to mainline and backported to release branches.

Uros.

Patch

Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 170208)
+++ config/i386/i386.c	(working copy)
@@ -28313,6 +28314,45 @@  ix86_secondary_reload (bool in_p, rtx x,
 	return Q_REGS;
     }
 
+  /* This condition handles a corner case where an expression involving
+     pointers gets vectorized.  We're trying to use the address of a
+     stack slot as a vector initializer.  
+
+     (set (reg:V2DI 74 [ vect_cst_.2 ])
+          (vec_duplicate:V2DI (reg/f:DI 20 frame)))
+
+     Eventually frame gets turned into sp+offset like this:
+
+     (set (reg:V2DI 21 xmm0 [orig:74 vect_cst_.2 ] [74])
+          (vec_duplicate:V2DI (plus:DI (reg/f:DI 7 sp)
+	                               (const_int 392 [0x188]))))
+
+     That later gets turned into:
+
+     (set (reg:V2DI 21 xmm0 [orig:74 vect_cst_.2 ] [74])
+          (vec_duplicate:V2DI (plus:DI (reg/f:DI 7 sp)
+	    (mem/u/c/i:DI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0 S8 A64]))))
+
+     We'll have the following reload recorded:
+
+     Reload 0: reload_in (DI) =
+           (plus:DI (reg/f:DI 7 sp)
+            (mem/u/c/i:DI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0 S8 A64]))
+     reload_out (V2DI) = (reg:V2DI 21 xmm0 [orig:74 vect_cst_.2 ] [74])
+     SSE_REGS, RELOAD_OTHER (opnum = 0), can't combine
+     reload_in_reg: (plus:DI (reg/f:DI 7 sp) (const_int 392 [0x188]))
+     reload_out_reg: (reg:V2DI 21 xmm0 [orig:74 vect_cst_.2 ] [74])
+     reload_reg_rtx: (reg:V2DI 22 xmm1)
+
+     This isn't going to work, since SSE instructions can't handle scalar
+     additions.  Returning GENERAL_REGS forces the addition into an integer
+     register, and reload can handle subsequent reloads without problems.  */
+
+  if (in_p && GET_CODE (x) == PLUS
+      && SSE_CLASS_P (rclass)
+      && SCALAR_INT_MODE_P (mode))
+    return GENERAL_REGS;
+
   return NO_REGS;
 }
 
Index: testsuite/gcc.target/i386/pr43653.c
===================================================================
--- testsuite/gcc.target/i386/pr43653.c	(revision 0)
+++ testsuite/gcc.target/i386/pr43653.c	(revision 0)
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O1 -ftree-vectorize -msse" } */
+
+typedef struct {} S;
+
+void *foo()
+{
+  S a[64], *p[64];
+  int i;
+
+  for (i = 0; i < 64; i++)
+    p[i] = &a[i];
+  return p[0];
+}
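
A note on why the test case above plausibly produces a vec_duplicate of
the frame address: S is an empty struct, which GNU C accepts as an
extension and gives size zero, so &a[i] is the same address for every
i. The sketch below only checks that property. It assumes a compiler
that implements the GNU empty-struct extension, and the helper name
distinct_slot_addresses is made up for this illustration; it says
nothing about what the vectorizer does internally.

```c
/* Empty struct: a GNU C extension; the type has size zero.  */
typedef struct {} S;

/* Count how many distinct addresses &a[i] takes over the 64 elements
   used in the test case.  With a zero-sized element type, every
   element sits at the same address as a[0].  */
static int
distinct_slot_addresses (void)
{
  S a[64];
  int i, distinct = 1;

  for (i = 1; i < 64; i++)
    if ((void *) &a[i] != (void *) &a[0])
      distinct++;

  return distinct;
}
```

Because all 64 stores in the loop write the same stack address, the
stored value is loop-invariant, which fits the "address of a stack slot
as a vector initializer" description in the patch comment.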