
[i386,AVX2] Remove redundant expands.

Message ID 20131016160621.GB22220@msticlxl57.ims.intel.com
State New

Commit Message

Kirill Yukhin Oct. 16, 2013, 4:06 p.m. UTC
Hello,

It seems that a group of AVX* patterns was copied and
pasted from SSE; however, since they are NDD (non-destructive
destination), we may remove the corresponding expands which sort the operands.
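
For illustration, here is the shape one of the surviving patterns takes
after the change (condensed from the patch below, with the attribute
list omitted).  The "%x" constraint already tells GCC that the two
source operands are commutative, so the separate expand that only
called ix86_fixup_binary_operands_no_copy adds nothing, and the
ix86_binary_operator_ok check in the insn condition reduces to plain
TARGET_AVX2:

(define_insn "vec_widen_smult_even_v8si"
  [(set (match_operand:V4DI 0 "register_operand" "=x")
	(mult:V4DI
	  (sign_extend:V4DI
	    (vec_select:V4SI
	      ;; "%" marks operand 1 as commutative with operand 2.
	      (match_operand:V8SI 1 "nonimmediate_operand" "%x")
	      (parallel [(const_int 0) (const_int 2)
			 (const_int 4) (const_int 6)])))
	  (sign_extend:V4DI
	    (vec_select:V4SI
	      ;; A lone memory source can always be matched here.
	      (match_operand:V8SI 2 "nonimmediate_operand" "xm")
	      (parallel [(const_int 0) (const_int 2)
			 (const_int 4) (const_int 6)])))))]
  "TARGET_AVX2"
  "vpmuldq\t{%2, %1, %0|%0, %1, %2}")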

ChangeLog:
	* config/i386/sse.md (vec_widen_umult_even_v8si): Remove expand,
	make insn visible, remove redundant check.
	(vec_widen_smult_even_v8si): Ditto.
	(avx2_pmaddwd): Ditto.
	(avx2_eq<mode>3): Ditto.
	(avx512f_eq<mode>3): Ditto.

Bootstrap passes.
All AVX* tests pass.

Is it ok to commit to main trunk?

--
Thanks, K

---
 gcc/ChangeLog          |   9 ++++
 gcc/config/i386/sse.md | 119 ++++++-------------------------------------------
 2 files changed, 23 insertions(+), 105 deletions(-)

Comments

Uros Bizjak Oct. 16, 2013, 4:21 p.m. UTC | #1
On Wed, Oct 16, 2013 at 6:06 PM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:

> It seems that gang of AVX* patterns were copy and
> pasted from SSE, however as far as they are NDD,
> we may remove corresponding expands which sort operands.
>
> ChangeLog:
>         * config/i386/sse.md (vec_widen_umult_even_v8si): Remove expand,
>         make insn visible, remove redundant check.
>         (vec_widen_smult_even_v8si): Ditto.
>         (avx2_pmaddwd): Ditto.
>         (avx2_eq<mode>3): Ditto.
>         (avx512f_eq<mode>3): Ditto.
>

> -(define_insn "*vec_widen_smult_even_v8si"
> +(define_insn "vec_widen_smult_even_v8si"
>    [(set (match_operand:V4DI 0 "register_operand" "=x")
>         (mult:V4DI
>           (sign_extend:V4DI
>             (vec_select:V4SI
> -             (match_operand:V8SI 1 "nonimmediate_operand" "x")
> +             (match_operand:V8SI 1 "nonimmediate_operand" "%x")
>               (parallel [(const_int 0) (const_int 2)
>                          (const_int 4) (const_int 6)])))
>           (sign_extend:V4DI
> @@ -6166,7 +6134,7 @@
>               (match_operand:V8SI 2 "nonimmediate_operand" "xm")
>               (parallel [(const_int 0) (const_int 2)
>                          (const_int 4) (const_int 6)])))))]
> -  "TARGET_AVX2 && ix86_binary_operator_ok (MULT, V8SImode, operands)"
> +  "TARGET_AVX2"
>    "vpmuldq\t{%2, %1, %0|%0, %1, %2}"
>    [(set_attr "isa" "avx")

Please also remove the above set_attr; it is not needed when the insn
is constrained with TARGET_AVX2.
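
That is, the tail of the pattern would end up roughly as below (a
sketch only; the attributes not visible in the quoted hunk are left
out unchanged):

  "TARGET_AVX2"
  "vpmuldq\t{%2, %1, %0|%0, %1, %2}"
  ;; (set_attr "isa" "avx") dropped: TARGET_AVX2 already implies AVX.
  [(set_attr "type" "sseimul")
   ...])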

OK with this addition.

Thanks,
Uros.
Uros Bizjak Oct. 16, 2013, 4:47 p.m. UTC | #2
On Wed, Oct 16, 2013 at 6:06 PM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:

> It seems that gang of AVX* patterns were copy and
> pasted from SSE, however as far as they are NDD,
> we may remove corresponding expands which sort operands.

OTOH, I have some second thoughts on removing AVX2 expanders.

Please consider the situation where we have *both* operands in
memory, and the insn is inside a loop. When reload comes around, it
will fix up one of the operands with a load from memory. However,
with the insn inside the loop, I suspect the load won't be moved out
of the loop.

So, I guess even AVX/AVX2 insn patterns should call
ix86_fixup_binary_operands_*, and this fixup function should be
improved to load one of the operands into a register in case both
operands are in memory.

This also means that you still need expanders for AVX512 commutative
multiplies.

Uros.
Richard Henderson Oct. 16, 2013, 5:01 p.m. UTC | #3
On 10/16/2013 09:47 AM, Uros Bizjak wrote:
> On Wed, Oct 16, 2013 at 6:06 PM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
> 
>> It seems that gang of AVX* patterns were copy and
>> pasted from SSE, however as far as they are NDD,
>> we may remove corresponding expands which sort operands.
> 
> OTOH, I have some second thoughts on removing AVX2 expanders.
> 
> Please consider the situation, where we have *both* operands in
> memory, and the insn is inside the loop. When reload comes around, it
> will fixup one of the operands with a load from memory. However,
> having insn in the loop, I suspect the load won't be moved out of
> loop.
> 
> So, I guess even AVX/AVX2 insn patterns should call
> ix86_fixup_binary_operands_*, and this fixup function should be
> improved to load one of the operands into register, in case both
> operands are in memory.
> 
> This also means, that you still need expanders for AVX512 commutative
> multiplies.
> 
> Uros.
> 

Fair enough.


r~
Uros Bizjak Oct. 16, 2013, 6:42 p.m. UTC | #4
On Wed, Oct 16, 2013 at 7:01 PM, Richard Henderson <rth@redhat.com> wrote:
> On 10/16/2013 09:47 AM, Uros Bizjak wrote:
>> On Wed, Oct 16, 2013 at 6:06 PM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
>>
>>> It seems that gang of AVX* patterns were copy and
>>> pasted from SSE, however as far as they are NDD,
>>> we may remove corresponding expands which sort operands.
>>
>> OTOH, I have some second thoughts on removing AVX2 expanders.
>>
>> Please consider the situation, where we have *both* operands in
>> memory, and the insn is inside the loop. When reload comes around, it
>> will fixup one of the operands with a load from memory. However,
>> having insn in the loop, I suspect the load won't be moved out of
>> loop.
>>
>> So, I guess even AVX/AVX2 insn patterns should call
>> ix86_fixup_binary_operands_*, and this fixup function should be
>> improved to load one of the operands into register, in case both
>> operands are in memory.
>>
>> This also means, that you still need expanders for AVX512 commutative
>> multiplies.
>
> Fair enough.

I have checked ix86_fixup_binary_operands and ix86_binary_operator_ok,
and they should be OK for destructive (SSE2) and non-destructive (AVX)
commutative instructions. Vector instructions always have their destination
in a register, so most of the fixups and checks do not apply at all. We
can probably use:

{
  if (MEM_P (operands[1]) && MEM_P (operands[2]))
    operands[1] = force_reg (<MODE>mode, operands[1]);
}

in expanders and

!(MEM_P (operands[1]) && MEM_P (operands[2]))

in insn constraints for most of SSE and AVX commutative insns.
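
Concretely, taking the avx2_eq<mode>3 pattern from this patch as an
example, the combination would look something like this (just a
sketch of the intended shape, untested):

(define_expand "avx2_eq<mode>3"
  [(set (match_operand:VI_256 0 "register_operand")
	(eq:VI_256
	  (match_operand:VI_256 1 "nonimmediate_operand")
	  (match_operand:VI_256 2 "nonimmediate_operand")))]
  "TARGET_AVX2"
{
  /* Load one source into a register here, in the expander, so that a
     loop-invariant load can still be hoisted; leaving the fixup to
     reload would keep the load inside the loop.  */
  if (MEM_P (operands[1]) && MEM_P (operands[2]))
    operands[1] = force_reg (<MODE>mode, operands[1]);
})

(define_insn "*avx2_eq<mode>3"
  [(set (match_operand:VI_256 0 "register_operand" "=x")
	(eq:VI_256
	  (match_operand:VI_256 1 "nonimmediate_operand" "%x")
	  (match_operand:VI_256 2 "nonimmediate_operand" "xm")))]
  ;; Reject the both-in-memory case; the expander has already fixed it up.
  "TARGET_AVX2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
  "vpcmpeq<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
  [(set_attr "type" "ssecmp")
   (set_attr "prefix_extra" "1")
   (set_attr "prefix" "vex")
   (set_attr "mode" "OI")])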

Uros.

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ebaa3e0..b25d8eb 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@ 
+2013-10-16  Kirill Yukhin  <kirill.yukhin@intel.com>
+
+	* config/i386/sse.md (vec_widen_umult_even_v8si): Remove expand,
+	make insn visible, remove redundant check.
+	(vec_widen_smult_even_v8si): Ditto.
+	(avx2_pmaddwd): Ditto.
+	(avx2_eq<mode>3): Ditto.
+	(avx512f_eq<mode>3): Ditto.
+
 2013-10-16  Yvan Roux  <yvan.roux@linaro.org>
 
 	* config/arm/arm.opt (mlra): New option.
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 2046dd5..57e4c2b 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -6067,23 +6067,7 @@ 
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "<sseinsnmode>")])
 
-(define_expand "vec_widen_umult_even_v8si"
-  [(set (match_operand:V4DI 0 "register_operand")
-	(mult:V4DI
-	  (zero_extend:V4DI
-	    (vec_select:V4SI
-	      (match_operand:V8SI 1 "nonimmediate_operand")
-	      (parallel [(const_int 0) (const_int 2)
-			 (const_int 4) (const_int 6)])))
-	  (zero_extend:V4DI
-	    (vec_select:V4SI
-	      (match_operand:V8SI 2 "nonimmediate_operand")
-	      (parallel [(const_int 0) (const_int 2)
-			 (const_int 4) (const_int 6)])))))]
-  "TARGET_AVX2"
-  "ix86_fixup_binary_operands_no_copy (MULT, V8SImode, operands);")
-
-(define_insn "*vec_widen_umult_even_v8si"
+(define_insn "vec_widen_umult_even_v8si"
   [(set (match_operand:V4DI 0 "register_operand" "=x")
 	(mult:V4DI
 	  (zero_extend:V4DI
@@ -6096,7 +6080,7 @@ 
 	      (match_operand:V8SI 2 "nonimmediate_operand" "xm")
 	      (parallel [(const_int 0) (const_int 2)
 			 (const_int 4) (const_int 6)])))))]
-  "TARGET_AVX2 && ix86_binary_operator_ok (MULT, V8SImode, operands)"
+  "TARGET_AVX2"
   "vpmuludq\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseimul")
    (set_attr "prefix" "vex")
@@ -6137,28 +6121,12 @@ 
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "TI")])
 
-(define_expand "vec_widen_smult_even_v8si"
-  [(set (match_operand:V4DI 0 "register_operand")
-	(mult:V4DI
-	  (sign_extend:V4DI
-	    (vec_select:V4SI
-	      (match_operand:V8SI 1 "nonimmediate_operand")
-	      (parallel [(const_int 0) (const_int 2)
-			 (const_int 4) (const_int 6)])))
-	  (sign_extend:V4DI
-	    (vec_select:V4SI
-	      (match_operand:V8SI 2 "nonimmediate_operand")
-	      (parallel [(const_int 0) (const_int 2)
-			 (const_int 4) (const_int 6)])))))]
-  "TARGET_AVX2"
-  "ix86_fixup_binary_operands_no_copy (MULT, V8SImode, operands);")
-
-(define_insn "*vec_widen_smult_even_v8si"
+(define_insn "vec_widen_smult_even_v8si"
   [(set (match_operand:V4DI 0 "register_operand" "=x")
 	(mult:V4DI
 	  (sign_extend:V4DI
 	    (vec_select:V4SI
-	      (match_operand:V8SI 1 "nonimmediate_operand" "x")
+	      (match_operand:V8SI 1 "nonimmediate_operand" "%x")
 	      (parallel [(const_int 0) (const_int 2)
 			 (const_int 4) (const_int 6)])))
 	  (sign_extend:V4DI
@@ -6166,7 +6134,7 @@ 
 	      (match_operand:V8SI 2 "nonimmediate_operand" "xm")
 	      (parallel [(const_int 0) (const_int 2)
 			 (const_int 4) (const_int 6)])))))]
-  "TARGET_AVX2 && ix86_binary_operator_ok (MULT, V8SImode, operands)"
+  "TARGET_AVX2"
   "vpmuldq\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "avx")
    (set_attr "type" "sseimul")
@@ -6210,41 +6178,7 @@ 
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "TI")])
 
-(define_expand "avx2_pmaddwd"
-  [(set (match_operand:V8SI 0 "register_operand")
-	(plus:V8SI
-	  (mult:V8SI
-	    (sign_extend:V8SI
-	      (vec_select:V8HI
-		(match_operand:V16HI 1 "nonimmediate_operand")
-		(parallel [(const_int 0) (const_int 2)
-			   (const_int 4) (const_int 6)
-			   (const_int 8) (const_int 10)
-			   (const_int 12) (const_int 14)])))
-	    (sign_extend:V8SI
-	      (vec_select:V8HI
-		(match_operand:V16HI 2 "nonimmediate_operand")
-		(parallel [(const_int 0) (const_int 2)
-			   (const_int 4) (const_int 6)
-			   (const_int 8) (const_int 10)
-			   (const_int 12) (const_int 14)]))))
-	  (mult:V8SI
-	    (sign_extend:V8SI
-	      (vec_select:V8HI (match_dup 1)
-		(parallel [(const_int 1) (const_int 3)
-			   (const_int 5) (const_int 7)
-			   (const_int 9) (const_int 11)
-			   (const_int 13) (const_int 15)])))
-	    (sign_extend:V8SI
-	      (vec_select:V8HI (match_dup 2)
-		(parallel [(const_int 1) (const_int 3)
-			   (const_int 5) (const_int 7)
-			   (const_int 9) (const_int 11)
-			   (const_int 13) (const_int 15)]))))))]
-  "TARGET_AVX2"
-  "ix86_fixup_binary_operands_no_copy (MULT, V16HImode, operands);")
-
-(define_insn "*avx2_pmaddwd"
+(define_insn "avx2_pmaddwd"
   [(set (match_operand:V8SI 0 "register_operand" "=x")
 	(plus:V8SI
 	  (mult:V8SI
@@ -6275,7 +6209,7 @@ 
 			   (const_int 5) (const_int 7)
 			   (const_int 9) (const_int 11)
 			   (const_int 13) (const_int 15)]))))))]
-  "TARGET_AVX2 && ix86_binary_operator_ok (MULT, V16HImode, operands)"
+  "TARGET_AVX2"
   "vpmaddwd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseiadd")
    (set_attr "prefix" "vex")
@@ -6618,20 +6552,12 @@ 
   [(set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
-(define_expand "<code><mode>3"
-  [(set (match_operand:VI124_256_48_512 0 "register_operand")
-	(maxmin:VI124_256_48_512
-	  (match_operand:VI124_256_48_512 1 "nonimmediate_operand")
-	  (match_operand:VI124_256_48_512 2 "nonimmediate_operand")))]
-  "TARGET_AVX2"
-  "ix86_fixup_binary_operands_no_copy (<CODE>, <MODE>mode, operands);")
-
-(define_insn "*avx2_<code><mode>3"
+(define_insn "<code><mode>3"
   [(set (match_operand:VI124_256_48_512 0 "register_operand" "=v")
 	(maxmin:VI124_256_48_512
 	  (match_operand:VI124_256_48_512 1 "nonimmediate_operand" "%v")
 	  (match_operand:VI124_256_48_512 2 "nonimmediate_operand" "vm")))]
-  "TARGET_AVX2 && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
+  "TARGET_AVX2"
   "vp<maxmin_int><ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseiadd")
    (set_attr "prefix_extra" "1")
@@ -6830,42 +6756,25 @@ 
 ;;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 
-(define_expand "avx2_eq<mode>3"
-  [(set (match_operand:VI_256 0 "register_operand")
-	(eq:VI_256
-	  (match_operand:VI_256 1 "nonimmediate_operand")
-	  (match_operand:VI_256 2 "nonimmediate_operand")))]
-  "TARGET_AVX2"
-  "ix86_fixup_binary_operands_no_copy (EQ, <MODE>mode, operands);")
-
-(define_insn "*avx2_eq<mode>3"
+(define_insn "avx2_eq<mode>3"
   [(set (match_operand:VI_256 0 "register_operand" "=x")
 	(eq:VI_256
 	  (match_operand:VI_256 1 "nonimmediate_operand" "%x")
 	  (match_operand:VI_256 2 "nonimmediate_operand" "xm")))]
-  "TARGET_AVX2 && ix86_binary_operator_ok (EQ, <MODE>mode, operands)"
+  "TARGET_AVX2"
   "vpcmpeq<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ssecmp")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "vex")
    (set_attr "mode" "OI")])
 
-(define_expand "avx512f_eq<mode>3"
-  [(set (match_operand:<avx512fmaskmode> 0 "register_operand")
-	(unspec:<avx512fmaskmode>
-	  [(match_operand:VI48_512 1 "register_operand")
-	   (match_operand:VI48_512 2 "nonimmediate_operand")]
-	  UNSPEC_MASKED_EQ))]
-  "TARGET_AVX512F"
-  "ix86_fixup_binary_operands_no_copy (EQ, <MODE>mode, operands);")
-
-(define_insn "avx512f_eq<mode>3_1"
+(define_insn "avx512f_eq<mode>3"
   [(set (match_operand:<avx512fmaskmode> 0 "register_operand" "=k")
 	(unspec:<avx512fmaskmode>
-	  [(match_operand:VI48_512 1 "register_operand" "%v")
+	  [(match_operand:VI48_512 1 "nonimmediate_operand" "%v")
 	   (match_operand:VI48_512 2 "nonimmediate_operand" "vm")]
 	  UNSPEC_MASKED_EQ))]
-  "TARGET_AVX512F && ix86_binary_operator_ok (EQ, <MODE>mode, operands)"
+  "TARGET_AVX512F"
   "vpcmpeq<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ssecmp")
    (set_attr "prefix_extra" "1")