Fix code quality regression caused by my gather vectorization patch

Message ID 20120102132952.GM18937@tyan-ft48-01.lab.bos.redhat.com
State New

Commit Message

Jakub Jelinek Jan. 2, 2012, 1:29 p.m. UTC
Hi!

I've noticed that my gather vectorization patch unfortunately regressed code
quality of
gcc.target/i386/avx2-i64gatherd256-2.c
gcc.target/i386/avx2-i64gatherd256-3.c
gcc.target/i386/avx2-i64gatherd256-4.c
gcc.target/i386/avx2-i64gatherps256-3.c
gcc.target/i386/avx2-i64gatherps256-4.c
tests.  The problem is that, after the unification of the gather
auto-vectorization and the gather intrinsics, nothing handles the new
vec_select of the first half of the gather pattern's result well.  Although
that vec_select is a nop, register allocation often allocates the gather
pattern's result in a different vector register from the result of the
following extraction of its first half.
This patch fixes the regression by adding two patterns for the combiner.
On some of the above tests it saves two instructions, on others one.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-01-02  Jakub Jelinek  <jakub@redhat.com>

	* config/i386/sse.md (*avx2_gatherdi<mode>_3, *avx2_gatherdi<mode>_4):
	New patterns.


	Jakub
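
For readers unfamiliar with the affected intrinsics, here is a minimal C
sketch of the source-level operation involved. It assumes a GCC-compatible
compiler (for `__attribute__((target))` and `__builtin_cpu_supports`); the
helper names `gather4_avx2`, `gather4_scalar` and `gather4` are hypothetical
and do not come from the testsuite. With four 64-bit indices the index vector
is 256 bits wide, yet only four 32-bit elements are gathered, so
`_mm256_i64gather_epi32` returns a 128-bit value. That is presumably the
"first half of the result" case described above.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* AVX2 path: four 64-bit indices fill a 256-bit register, but the four
   gathered 32-bit ints need only 128 bits, so the intrinsic returns
   an __m128i.  */
__attribute__((target("avx2")))
static void gather4_avx2(const int *base, const long long idx[4], int out[4])
{
    __m256i vidx = _mm256_loadu_si256((const __m256i *)idx);
    /* Scale 4 == sizeof(int): each index counts int elements.  */
    __m128i lo = _mm256_i64gather_epi32(base, vidx, 4);
    _mm_storeu_si128((__m128i *)out, lo);
}
#endif

/* Scalar reference with the same semantics: out[i] = base[idx[i]].  */
static void gather4_scalar(const int *base, const long long idx[4], int out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = base[idx[i]];
}

/* Hypothetical dispatcher: use the AVX2 gather when the CPU supports it,
   otherwise fall back to the scalar loop.  */
static void gather4(const int *base, const long long idx[4], int out[4])
{
    assert(base != NULL && idx != NULL && out != NULL);
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2")) {
        gather4_avx2(base, idx, out);
        return;
    }
#endif
    gather4_scalar(base, idx, out);
}
```

Calling `gather4(table, idx, out)` with `idx = {7, 0, 3, 5}` copies
`table[7]`, `table[0]`, `table[3]` and `table[5]` into `out`. Whether the new
combiner patterns actually match for such code depends on the compiler
version and options, so this only illustrates the operation, not the
generated assembly.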

Comments

Uros Bizjak Jan. 2, 2012, 1:47 p.m. UTC | #1
On Mon, Jan 2, 2012 at 2:29 PM, Jakub Jelinek <jakub@redhat.com> wrote:

> I've noticed that my gather vectorization patch unfortunately regressed code
> quality of
> gcc.target/i386/avx2-i64gatherd256-2.c
> gcc.target/i386/avx2-i64gatherd256-3.c
> gcc.target/i386/avx2-i64gatherd256-4.c
> gcc.target/i386/avx2-i64gatherps256-3.c
> gcc.target/i386/avx2-i64gatherps256-4.c
> tests.  The problem is that, after the unification of the gather
> auto-vectorization and the gather intrinsics, nothing handles the new
> vec_select of the first half of the gather pattern's result well.  Although
> that vec_select is a nop, register allocation often allocates the gather
> pattern's result in a different vector register from the result of the
> following extraction of its first half.
> This patch fixes the regression by adding two patterns for the combiner.
> On some of the above tests it saves two instructions, on others one.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2012-01-02  Jakub Jelinek  <jakub@redhat.com>
>
>        * config/i386/sse.md (*avx2_gatherdi<mode>_3, *avx2_gatherdi<mode>_4):
>        New patterns.

OK.

Thanks,
Uros.

Patch

--- gcc/config/i386/sse.md.jj	2011-11-07 20:32:09.000000000 +0100
+++ gcc/config/i386/sse.md	2011-11-07 20:52:54.000000000 +0100
@@ -12652,3 +12652,49 @@  (define_insn "*avx2_gatherdi<mode>_2"
   [(set_attr "type" "ssemov")
    (set_attr "prefix" "vex")
    (set_attr "mode" "<sseinsnmode>")])
+
+(define_insn "*avx2_gatherdi<mode>_3"
+  [(set (match_operand:<VEC_GATHER_SRCDI> 0 "register_operand" "=&x")
+	(vec_select:<VEC_GATHER_SRCDI>
+	  (unspec:VI4F_256
+	    [(match_operand:<VEC_GATHER_SRCDI> 2 "register_operand" "0")
+	     (match_operator:<ssescalarmode> 7 "vsib_mem_operator"
+	       [(unspec:P
+		  [(match_operand:P 3 "vsib_address_operand" "p")
+		   (match_operand:<VEC_GATHER_IDXDI> 4 "register_operand" "x")
+		   (match_operand:SI 6 "const1248_operand" "n")]
+		  UNSPEC_VSIBADDR)])
+	     (mem:BLK (scratch))
+	     (match_operand:<VEC_GATHER_SRCDI> 5 "register_operand" "1")]
+	     UNSPEC_GATHER)
+	  (parallel [(const_int 0) (const_int 1)
+		     (const_int 2) (const_int 3)])))
+   (clobber (match_scratch:VI4F_256 1 "=&x"))]
+  "TARGET_AVX2"
+  "v<sseintprefix>gatherq<ssemodesuffix>\t{%5, %7, %0|%0, %7, %5}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "vex")
+   (set_attr "mode" "<sseinsnmode>")])
+
+(define_insn "*avx2_gatherdi<mode>_4"
+  [(set (match_operand:<VEC_GATHER_SRCDI> 0 "register_operand" "=&x")
+	(vec_select:<VEC_GATHER_SRCDI>
+	  (unspec:VI4F_256
+	    [(pc)
+	     (match_operator:<ssescalarmode> 6 "vsib_mem_operator"
+	       [(unspec:P
+		  [(match_operand:P 2 "vsib_address_operand" "p")
+		   (match_operand:<VEC_GATHER_IDXDI> 3 "register_operand" "x")
+		   (match_operand:SI 5 "const1248_operand" "n")]
+		  UNSPEC_VSIBADDR)])
+	     (mem:BLK (scratch))
+	     (match_operand:<VEC_GATHER_SRCDI> 4 "register_operand" "1")]
+	    UNSPEC_GATHER)
+	  (parallel [(const_int 0) (const_int 1)
+		     (const_int 2) (const_int 3)])))
+   (clobber (match_scratch:VI4F_256 1 "=&x"))]
+  "TARGET_AVX2"
+  "v<sseintprefix>gatherq<ssemodesuffix>\t{%4, %6, %0|%0, %6, %4}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "vex")
+   (set_attr "mode" "<sseinsnmode>")])