Patchwork Optimize nested SIGN_EXTENDs/ZERO_EXTENDs (PR target/45336)

login
register
mail settings
Submitter Jakub Jelinek
Date Aug. 20, 2010, 1:50 p.m.
Message ID <20100820135046.GC702@tyan-ft48-01.lab.bos.redhat.com>
Download mbox | patch
Permalink /patch/62279/
State New
Headers show

Comments

Jakub Jelinek - Aug. 20, 2010, 1:50 p.m.
On Fri, Aug 20, 2010 at 11:46:48AM +0200, Paolo Bonzini wrote:
> On 08/19/2010 06:33 PM, Jakub Jelinek wrote:
> >+      /* (sign_extend:M (sign_extend:N<X>)) is (sign_extend:M<X>).  */
> >+      if (GET_CODE (op) == SIGN_EXTEND)
> >+	return simplify_gen_unary (SIGN_EXTEND, mode, XEXP (op, 0),
> >+				   GET_MODE (XEXP (op, 0)));
> 
> Maybe
> 
> /* (sign_extend:M (sign_extend:N<X>)) is (sign_extend:M<X>).
>    (sign_extend:M (zero_extend:N<X>)) is (zero_extend:M<X>).  */
> if (GET_CODE (op) == SIGN_EXTEND || GET_CODE (op) == ZERO_EXTEND)
>   {
>     gcc_assert (GET_MODE (op) != mode);
>     return simplify_gen_unary (GET_CODE (op), mode, XEXP (op, 0),
>                                GET_MODE (XEXP (op, 0)));
>   }
> 
> ?

Good idea.

On Thu, Aug 19, 2010 at 07:58:24PM +0200, Bernd Schmidt wrote:
> > ZERO_EXTEND is expanded into AND instead of LSHIFTRT/ASHIFT by combine,
> > and
> > two nested ANDs should be easily optimized.  If you really want, I can
> > handle the ZERO_EXTEND (AND <X> mask) case as a follow-up.
> 
> Oh, in expand_compound_operation?  I was thinking about finding such
> shifts in a pair of insns (happens on Thumb).  Also, what about callers
> other than combine?

Ok.  The following patch thus adds optimization of sign_extend of zero_extend,
sign_extend of lshiftrt/ashift and zero_extend of lshiftrt/ashift.

Bootstrapped/regtested on x86_64-linux and i686-linux (yes,rtl checking)
with additional patchlet to gather statistics and in both
bootstraps/regtests together this optimization happened (the fact that
simplify-rtx.c simplified it doesn't still mean the result was usable in
whatever pass asked for it)	times:
(sign_extend (sign_extend))	 2059
(sign_extend (zero_extend))	87019
(sign_extend (ashiftrt))	 1485
(zero_extend (zero_extend))	15648
lshiftrt wasn't seen, but it is very well possible it shows up on other
targets.  Ok for trunk?

2010-08-20  Jakub Jelinek  <jakub@redhat.com>
	    Paolo Bonzini  <bonzini@gnu.org>

	* simplify-rtx.c (simplify_unary_operation_1): Optimize
	(sign_extend (zero_extend ()) and
	({sign,zero}_extend (lshiftrt (ashift X (const_int I)) (const_int I))).



	Jakub
Richard Henderson - Aug. 20, 2010, 4:07 p.m.
On 08/20/2010 06:50 AM, Jakub Jelinek wrote:
> 2010-08-20  Jakub Jelinek  <jakub@redhat.com>
> 	    Paolo Bonzini  <bonzini@gnu.org>
> 
> 	* simplify-rtx.c (simplify_unary_operation_1): Optimize
> 	(sign_extend (zero_extend ()) and
> 	({sign,zero}_extend (lshiftrt (ashift X (const_int I)) (const_int I))).

Ok.


r~
H.J. Lu - Aug. 20, 2010, 4:56 p.m.
On Fri, Aug 20, 2010 at 6:50 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Aug 20, 2010 at 11:46:48AM +0200, Paolo Bonzini wrote:
>> On 08/19/2010 06:33 PM, Jakub Jelinek wrote:
>> >+      /* (sign_extend:M (sign_extend:N<X>)) is (sign_extend:M<X>).  */
>> >+      if (GET_CODE (op) == SIGN_EXTEND)
>> >+    return simplify_gen_unary (SIGN_EXTEND, mode, XEXP (op, 0),
>> >+                               GET_MODE (XEXP (op, 0)));
>>
>> Maybe
>>
>> /* (sign_extend:M (sign_extend:N<X>)) is (sign_extend:M<X>).
>>    (sign_extend:M (zero_extend:N<X>)) is (zero_extend:M<X>).  */
>> if (GET_CODE (op) == SIGN_EXTEND || GET_CODE (op) == ZERO_EXTEND)
>>   {
>>     gcc_assert (GET_MODE (op) != mode);
>>     return simplify_gen_unary (GET_CODE (op), mode, XEXP (op, 0),
>>                                GET_MODE (XEXP (op, 0)));
>>   }
>>
>> ?
>
> Good idea.
>
> On Thu, Aug 19, 2010 at 07:58:24PM +0200, Bernd Schmidt wrote:
>> > ZERO_EXTEND is expanded into AND instead of LSHIFTRT/ASHIFT by combine,
>> > and
>> > two nested ANDs should be easily optimized.  If you really want, I can
>> > handle the ZERO_EXTEND (AND <X> mask) case as a follow-up.
>>
>> Oh, in expand_compound_operation?  I was thinking about finding such
>> shifts in a pair of insns (happens on Thumb).  Also, what about callers
>> other than combine?
>
> Ok.  The following patch thus adds optimization of sign_extend of zero_extend,
> sign_extend of lshiftrt/ashift and zero_extend of lshiftrt/ashift.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux (yes,rtl checking)
> with additional patchlet to gather statistics and in both
> bootstraps/regtests together this optimization happened (the fact that
> simplify-rtx.c simplified it doesn't still mean the result was usable in
> whatever pass asked for it)     times:
> (sign_extend (sign_extend))      2059
> (sign_extend (zero_extend))     87019
> (sign_extend (ashiftrt))         1485
> (zero_extend (zero_extend))     15648
> lshiftrt wasn't seen, but it is very well possible it shows up on other
> targets.  Ok for trunk?
>
> 2010-08-20  Jakub Jelinek  <jakub@redhat.com>
>            Paolo Bonzini  <bonzini@gnu.org>
>
>        * simplify-rtx.c (simplify_unary_operation_1): Optimize
>        (sign_extend (zero_extend ()) and
>        ({sign,zero}_extend (lshiftrt (ashift X (const_int I)) (const_int I))).
>

Can we optimize out sign_extend in:

(insn:TI 9 7 10 (set (reg:SI 0 ax [orig:68 D.6819 ] [68])
        (zero_extend:SI (vec_select:QI (reg:V16QI 21 xmm0 [orig:64 x ] [64])
                (parallel [
                        (const_int 4 [0x4])
                    ]))))
/export/build/gnu/gcc/build-x86_64-linux/gcc/include/smmintrin.h:442 1681
{*sse4_1_pextrb}
     (expr_list:REG_DEAD (reg:V16QI 21 xmm0 [orig:64 x ] [64])
        (nil)))

(insn:TI 10 9 18 (set (reg:DI 0 ax [orig:67 D.6819 ] [67])
        (sign_extend:DI (reg:SI 0 ax [orig:68 D.6819 ] [68]))) foo.c:3 126
{extendsidi2_rex64}
     (nil))
Jakub Jelinek - Aug. 20, 2010, 5:27 p.m.
On Fri, Aug 20, 2010 at 09:56:41AM -0700, H.J. Lu wrote:
> Can we optimize out sign_extend in:
> 
> (insn:TI 9 7 10 (set (reg:SI 0 ax [orig:68 D.6819 ] [68])
>         (zero_extend:SI (vec_select:QI (reg:V16QI 21 xmm0 [orig:64 x ] [64])
>                 (parallel [
>                         (const_int 4 [0x4])
>                     ]))))
> /export/build/gnu/gcc/build-x86_64-linux/gcc/include/smmintrin.h:442 1681
> {*sse4_1_pextrb}
>      (expr_list:REG_DEAD (reg:V16QI 21 xmm0 [orig:64 x ] [64])
>         (nil)))
> 
> (insn:TI 10 9 18 (set (reg:DI 0 ax [orig:67 D.6819 ] [67])
>         (sign_extend:DI (reg:SI 0 ax [orig:68 D.6819 ] [68]))) foo.c:3 126
> {extendsidi2_rex64}
>      (nil))

I guess only if there is a
  [(set (match_operand:DI 0 "register_operand" "=r")
        (zero_extend:DI
          (vec_select:QI
            (match_operand:V16QI 1 "register_operand" "x")
            (parallel [(match_operand:SI 2 "const_0_to_15_operand" "n")]))))]
pattern for TARGET_64BIT.
Not sure what exactly is
pextrb ..., %ecx
insn doing to the upper 32 bits of %rcx, if it clears them, such an insn
would be nice to have in sse.md (similarly for pextrw/pextrd).
If there is such a pattern, I think with my simplify_* changes the combiner
will do its job.

	Jakub
H.J. Lu - Aug. 20, 2010, 5:35 p.m.
On Fri, Aug 20, 2010 at 10:27 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Aug 20, 2010 at 09:56:41AM -0700, H.J. Lu wrote:
>> Can we optimize out sign_extend in:
>>
>> (insn:TI 9 7 10 (set (reg:SI 0 ax [orig:68 D.6819 ] [68])
>>         (zero_extend:SI (vec_select:QI (reg:V16QI 21 xmm0 [orig:64 x ] [64])
>>                 (parallel [
>>                         (const_int 4 [0x4])
>>                     ]))))
>> /export/build/gnu/gcc/build-x86_64-linux/gcc/include/smmintrin.h:442 1681
>> {*sse4_1_pextrb}
>>      (expr_list:REG_DEAD (reg:V16QI 21 xmm0 [orig:64 x ] [64])
>>         (nil)))
>>
>> (insn:TI 10 9 18 (set (reg:DI 0 ax [orig:67 D.6819 ] [67])
>>         (sign_extend:DI (reg:SI 0 ax [orig:68 D.6819 ] [68]))) foo.c:3 126
>> {extendsidi2_rex64}
>>      (nil))
>
> I guess only if there is a
>  [(set (match_operand:DI 0 "register_operand" "=r")
>        (zero_extend:DI
>          (vec_select:QI
>            (match_operand:V16QI 1 "register_operand" "x")
>            (parallel [(match_operand:SI 2 "const_0_to_15_operand" "n")]))))]
> pattern for TARGET_64BIT.
> Not sure what exactly is
> pextrb ..., %ecx
> insn doing to the upper 32 bits of %rcx, if it clears them, such an insn
> would be nice to have in sse.md (similarly for pextrw/pextrd).
> If there is such a pattern, I think with my simplify_* changes the combiner
> will do its job.
>

We can certain add them. I will give it a try.
Paolo Bonzini - Aug. 20, 2010, 6 p.m.
On 08/20/2010 07:27 PM, Jakub Jelinek wrote:
> Not sure what exactly is
> pextrb ..., %ecx
> insn doing to the upper 32 bits of %rcx, if it clears them

Probably yes like every other 32-bit writeback on x86_64.

Paolo

Patch

--- gcc/simplify-rtx.c.jj	2010-08-20 12:33:30.000000000 +0200
+++ gcc/simplify-rtx.c	2010-08-20 13:11:15.000000000 +0200
@@ -1010,15 +1010,22 @@  simplify_unary_operation_1 (enum rtx_cod
 	  && GET_MODE_SIZE (mode) <= GET_MODE_SIZE (GET_MODE (XEXP (op, 0))))
 	return rtl_hooks.gen_lowpart_no_emit (mode, op);
 
-      /* (sign_extend:M (sign_extend:N <X>)) is (sign_extend:M <X>).  */
-      if (GET_CODE (op) == SIGN_EXTEND)
-	return simplify_gen_unary (SIGN_EXTEND, mode, XEXP (op, 0),
-				   GET_MODE (XEXP (op, 0)));
+      /* (sign_extend:M (sign_extend:N <X>)) is (sign_extend:M <X>).
+	 (sign_extend:M (zero_extend:N <X>)) is (zero_extend:M <X>).  */
+      if (GET_CODE (op) == SIGN_EXTEND || GET_CODE (op) == ZERO_EXTEND)
+	{
+	  gcc_assert (GET_MODE_BITSIZE (mode)
+		      > GET_MODE_BITSIZE (GET_MODE (op)));
+	  return simplify_gen_unary (GET_CODE (op), mode, XEXP (op, 0),
+				     GET_MODE (XEXP (op, 0)));
+	}
 
       /* (sign_extend:M (ashiftrt:N (ashift <X> (const_int I)) (const_int I)))
 	 is (sign_extend:M (subreg:O <X>)) if there is mode with
-	 GET_MODE_BITSIZE (N) - I bits.  */
-      if (GET_CODE (op) == ASHIFTRT
+	 GET_MODE_BITSIZE (N) - I bits.
+	 (sign_extend:M (lshiftrt:N (ashift <X> (const_int I)) (const_int I)))
+	 is similarly (zero_extend:M (subreg:O <X>)).  */
+      if ((GET_CODE (op) == ASHIFTRT || GET_CODE (op) == LSHIFTRT)
 	  && GET_CODE (XEXP (op, 0)) == ASHIFT
 	  && CONST_INT_P (XEXP (op, 1))
 	  && XEXP (XEXP (op, 0), 1) == XEXP (op, 1)
@@ -1027,11 +1034,15 @@  simplify_unary_operation_1 (enum rtx_cod
 	  enum machine_mode tmode
 	    = mode_for_size (GET_MODE_BITSIZE (GET_MODE (op))
 			     - INTVAL (XEXP (op, 1)), MODE_INT, 1);
+	  gcc_assert (GET_MODE_BITSIZE (mode)
+		      > GET_MODE_BITSIZE (GET_MODE (op)));
 	  if (tmode != BLKmode)
 	    {
 	      rtx inner =
 		rtl_hooks.gen_lowpart_no_emit (tmode, XEXP (XEXP (op, 0), 0));
-	      return simplify_gen_unary (SIGN_EXTEND, mode, inner, tmode);
+	      return simplify_gen_unary (GET_CODE (op) == ASHIFTRT
+					 ? SIGN_EXTEND : ZERO_EXTEND,
+					 mode, inner, tmode);
 	    }
 	}
 
@@ -1066,6 +1077,26 @@  simplify_unary_operation_1 (enum rtx_cod
 	return simplify_gen_unary (ZERO_EXTEND, mode, XEXP (op, 0),
 				   GET_MODE (XEXP (op, 0)));
 
+      /* (zero_extend:M (lshiftrt:N (ashift <X> (const_int I)) (const_int I)))
+	 is (zero_extend:M (subreg:O <X>)) if there is mode with
+	 GET_MODE_BITSIZE (N) - I bits.  */
+      if (GET_CODE (op) == LSHIFTRT
+	  && GET_CODE (XEXP (op, 0)) == ASHIFT
+	  && CONST_INT_P (XEXP (op, 1))
+	  && XEXP (XEXP (op, 0), 1) == XEXP (op, 1)
+	  && GET_MODE_BITSIZE (GET_MODE (op)) > INTVAL (XEXP (op, 1)))
+	{
+	  enum machine_mode tmode
+	    = mode_for_size (GET_MODE_BITSIZE (GET_MODE (op))
+			     - INTVAL (XEXP (op, 1)), MODE_INT, 1);
+	  if (tmode != BLKmode)
+	    {
+	      rtx inner =
+		rtl_hooks.gen_lowpart_no_emit (tmode, XEXP (XEXP (op, 0), 0));
+	      return simplify_gen_unary (ZERO_EXTEND, mode, inner, tmode);
+	    }
+	}
+
 #if defined(POINTERS_EXTEND_UNSIGNED) && !defined(HAVE_ptr_extend)
       /* As we do not know which address space the pointer is refering to,
 	 we can do this only if the target does not support different pointer