Patchwork ARM: More reorganization of extend patterns (PR43137)

login
register
mail settings
Submitter Bernd Schmidt
Date July 13, 2010, 9:56 a.m.
Message ID <4C3C383E.8060604@codesourcery.com>
Download mbox | patch
Permalink /patch/58731/
State New
Headers show

Comments

Bernd Schmidt - July 13, 2010, 9:56 a.m.
This patch merges a number of different extend:DI patterns by using a
QHSI mode_iterator.  It also removes separate patterns from thumb2.md as
we can just set attr ce_count for ARM patterns as well.  I've completed
what I'd partially done in my previous patch, which is to remove
unnecessary constant pool handling from these patterns - none of them
accept constants.

To fix PR43137, I've added splitters that run before the subreg2 pass.
In general, the ARM backend makes practically no use of lower-subreg; I
have given up on fixing that for the moment as it was more effort than I
can spare right now.  Just doing it for these patterns is enough to be a
win in certain cases.

Regression tested (together with other patches) on
    qemu-system-armv7/arch=armv7-a/thumb
    qemu-system-armv7/thumb
    qemu-system-armv7

Ok?


Bernd
PR target/43137
	* config/arm/arm.md (zero_extendsidi2, arm_zero_extendsidi2,
	arm_exxtendsidi2, arm_extendsidi2): Delete patterns.
	(qhs_zextenddi_cond, qhs_sextenddi_cond): New define_mode_attrs.
	(zero_extend<mode>di2, extend<mode>di2 and related splits): New.
	(thumb1_zero_extendhisi2): Remove code to handle LABEL_REFs.
	Remove pool_range attribute.
	(arm_zero_extendhisi2, arm_zero_extendhisi2_v6, arm_zero_extendqisi2,
	arm_zero_extendqisi2_v6, thumb1_zero_extendqisi2_v6): Remove
	pool_range and neg_pool_range attributes.
	* config/arm/thumb2.md (thumb2_zero_extendsidi2,
	thumb2_zero_extendhidi2, thumb2_zero_extendqidi2, thumb2_extendsidi2,
	thumb2_extendhidi2, thumb2_extendqidi2): Delete.

	PR target/43137
	* gcc.target/arm/pr43137.c: New test.
Bernd Schmidt - Aug. 4, 2010, 2:56 p.m.
On 07/13/2010 11:56 AM, Bernd Schmidt wrote:
> This patch merges a number of different extend:DI patterns by using a
> QHSI mode_iterator.  It also removes separate patterns from thumb2.md as
> we can just set attr ce_count for ARM patterns as well.  I've completed
> what I'd partially done in my previous patch, which is to remove
> unnecessary constant pool handling from these patterns - none of them
> accept constants.
> 
> To fix PR43137, I've added splitters that run before the subreg2 pass.
> In general, the ARM backend makes practically no use of lower-subreg; I
> have given up on fixing that for the moment as it was more effort than I
> can spare right now.  Just doing it for these patterns is enough to be a
> win in certain cases.

  http://gcc.gnu.org/ml/gcc-patches/2010-07/msg01036.html


Bernd
Bernd Schmidt - Aug. 24, 2010, 10:26 a.m.
On 08/04/2010 04:56 PM, Bernd Schmidt wrote:
> On 07/13/2010 11:56 AM, Bernd Schmidt wrote:
>> This patch merges a number of different extend:DI patterns by using a
>> QHSI mode_iterator.  It also removes separate patterns from thumb2.md as
>> we can just set attr ce_count for ARM patterns as well.  I've completed
>> what I'd partially done in my previous patch, which is to remove
>> unnecessary constant pool handling from these patterns - none of them
>> accept constants.
>>
>> To fix PR43137, I've added splitters that run before the subreg2 pass.
>> In general, the ARM backend makes practically no use of lower-subreg; I
>> have given up on fixing that for the moment as it was more effort than I
>> can spare right now.  Just doing it for these patterns is enough to be a
>> win in certain cases.

  http://gcc.gnu.org/ml/gcc-patches/2010-07/msg01036.html


Bernd
Richard Earnshaw - Sept. 1, 2010, 3:52 p.m.
On Tue, 2010-07-13 at 10:56 +0100, Bernd Schmidt wrote:
> This patch merges a number of different extend:DI patterns by using a
> QHSI mode_iterator.  It also removes separate patterns from thumb2.md
> as
> we can just set attr ce_count for ARM patterns as well.  I've
> completed
> what I'd partially done in my previous patch, which is to remove
> unnecessary constant pool handling from these patterns - none of them
> accept constants.
> 
> To fix PR43137, I've added splitters that run before the subreg2 pass.
> In general, the ARM backend makes practically no use of lower-subreg;
> I
> have given up on fixing that for the moment as it was more effort than
> I
> can spare right now.  Just doing it for these patterns is enough to be
> a
> win in certain cases.
> 
> Regression tested (together with other patches) on
>     qemu-system-armv7/arch=armv7-a/thumb
>     qemu-system-armv7/thumb
>     qemu-system-armv7
> 
> Ok?
> 

The define_mode_attrs need to be moved to iterators.md.  Also, a number
of patterns have been changed (inconsistently with the rest of arm.md)
to not have the terminating ')' in column 1.

Otherwise, OK.

R.

Patch

Index: gcc/config/arm/arm.md
===================================================================
--- gcc.orig/config/arm/arm.md
+++ gcc/config/arm/arm.md
@@ -3990,69 +3990,82 @@ 
 
 ;; Zero and sign extension instructions.
 
-(define_expand "zero_extendsidi2"
-  [(set (match_operand:DI 0 "s_register_operand" "")
-        (zero_extend:DI (match_operand:SI 1 "s_register_operand" "")))]
-  "TARGET_32BIT"
-  ""
-)
+(define_mode_attr qhs_zextenddi_cond [(SI "") (HI "&& arm_arch6") (QI "")])
+(define_mode_attr qhs_sextenddi_cond [(SI "") (HI "&& arm_arch6")
+				      (QI "&& arm_arch6")])
 
-(define_insn "*arm_zero_extendsidi2"
+(define_insn "zero_extend<mode>di2"
   [(set (match_operand:DI 0 "s_register_operand" "=r")
-        (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r")))]
-  "TARGET_ARM"
-  "*
-    if (REGNO (operands[1])
-        != REGNO (operands[0]) + (WORDS_BIG_ENDIAN ? 1 : 0))
-      output_asm_insn (\"mov%?\\t%Q0, %1\", operands);
-    return \"mov%?\\t%R0, #0\";
-  "
+        (zero_extend:DI (match_operand:QHSI 1 "nonimmediate_operand" "rm")))]
+  "TARGET_32BIT <qhs_zextenddi_cond>"
+  "#"
   [(set_attr "length" "8")
-   (set_attr "predicable" "yes")]
-)
+   (set_attr "ce_count" "2")
+   (set_attr "predicable" "yes")])
 
-(define_expand "zero_extendqidi2"
-  [(set (match_operand:DI                 0 "s_register_operand"  "")
-	(zero_extend:DI (match_operand:QI 1 "nonimmediate_operand" "")))]
-  "TARGET_32BIT"
-  ""
-)
-
-(define_insn "*arm_zero_extendqidi2"
-  [(set (match_operand:DI                 0 "s_register_operand"  "=r,r")
-	(zero_extend:DI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
-  "TARGET_ARM"
-  "@
-   and%?\\t%Q0, %1, #255\;mov%?\\t%R0, #0
-   ldr%(b%)\\t%Q0, %1\;mov%?\\t%R0, #0"
+(define_insn "extend<mode>di2"
+  [(set (match_operand:DI 0 "s_register_operand" "=r")
+        (sign_extend:DI (match_operand:QHSI 1 "nonimmediate_operand" "rm")))]
+  "TARGET_32BIT <qhs_sextenddi_cond>"
+  "#"
   [(set_attr "length" "8")
-   (set_attr "predicable" "yes")
-   (set_attr "type" "*,load_byte")
-   (set_attr "pool_range" "*,4092")
-   (set_attr "neg_pool_range" "*,4084")]
-)
+   (set_attr "ce_count" "2")
+   (set_attr "shift" "1")
+   (set_attr "predicable" "yes")])
 
-(define_expand "extendsidi2"
+;; Splits for all extensions to DImode
+(define_split
   [(set (match_operand:DI 0 "s_register_operand" "")
-        (sign_extend:DI (match_operand:SI 1 "s_register_operand" "")))]
+        (zero_extend:DI (match_operand 1 "nonimmediate_operand" "")))]
   "TARGET_32BIT"
-  ""
-)
+  [(set (match_dup 0) (match_dup 1))]
+{
+  rtx insn;
+  rtx lo_part = gen_lowpart (SImode, operands[0]);
+  enum machine_mode src_mode = GET_MODE (operands[1]);
+
+  if (REG_P (operands[0])
+      && !reg_overlap_mentioned_p (operands[0], operands[1]))
+    emit_clobber (operands[0]);
+  if (!REG_P (lo_part) || src_mode != SImode
+      || !rtx_equal_p (lo_part, operands[1]))
+    {
+      if (src_mode == SImode)
+        emit_move_insn (lo_part, operands[1]);
+      else
+        emit_insn (gen_rtx_SET (VOIDmode, lo_part,
+				gen_rtx_ZERO_EXTEND (SImode, operands[1])));
+      operands[1] = lo_part;
+    }
+  operands[0] = gen_highpart (SImode, operands[0]);
+  operands[1] = const0_rtx;
+})
 
-(define_insn "*arm_extendsidi2"
-  [(set (match_operand:DI 0 "s_register_operand" "=r")
-        (sign_extend:DI (match_operand:SI 1 "s_register_operand" "r")))]
-  "TARGET_ARM"
-  "*
-    if (REGNO (operands[1])
-        != REGNO (operands[0]) + (WORDS_BIG_ENDIAN ? 1 : 0))
-      output_asm_insn (\"mov%?\\t%Q0, %1\", operands);
-    return \"mov%?\\t%R0, %Q0, asr #31\";
-  "
-  [(set_attr "length" "8")
-   (set_attr "shift" "1")
-   (set_attr "predicable" "yes")]
-)
+(define_split
+  [(set (match_operand:DI 0 "s_register_operand" "")
+        (sign_extend:DI (match_operand 1 "nonimmediate_operand" "")))]
+  "TARGET_32BIT"
+  [(set (match_dup 0) (ashiftrt:SI (match_dup 1) (const_int 31)))]
+{
+  rtx lo_part = gen_lowpart (SImode, operands[0]);
+  enum machine_mode src_mode = GET_MODE (operands[1]);
+
+  if (REG_P (operands[0])
+      && !reg_overlap_mentioned_p (operands[0], operands[1]))
+    emit_clobber (operands[0]);
+
+  if (!REG_P (lo_part) || src_mode != SImode
+      || !rtx_equal_p (lo_part, operands[1]))
+    {
+      if (src_mode == SImode)
+        emit_move_insn (lo_part, operands[1]);
+      else
+        emit_insn (gen_rtx_SET (VOIDmode, lo_part,
+				gen_rtx_SIGN_EXTEND (SImode, operands[1])));
+      operands[1] = lo_part;
+    }
+  operands[0] = gen_highpart (SImode, operands[0]);
+})
 
 (define_expand "zero_extendhisi2"
   [(set (match_operand:SI 0 "s_register_operand" "")
@@ -4088,26 +4101,22 @@ 
   [(set (match_operand:SI 0 "register_operand" "=l,l")
 	(zero_extend:SI (match_operand:HI 1 "nonimmediate_operand" "l,m")))]
   "TARGET_THUMB1"
-  "*
+{
   rtx mem;
 
   if (which_alternative == 0 && arm_arch6)
-    return \"uxth\\t%0, %1\";
+    return "uxth\t%0, %1";
   if (which_alternative == 0)
-    return \"#\";
+    return "#";
 
   mem = XEXP (operands[1], 0);
 
   if (GET_CODE (mem) == CONST)
     mem = XEXP (mem, 0);
     
-  if (GET_CODE (mem) == LABEL_REF)
-    return \"ldr\\t%0, %1\";
-    
   if (GET_CODE (mem) == PLUS)
     {
       rtx a = XEXP (mem, 0);
-      rtx b = XEXP (mem, 1);
 
       /* This can happen due to bugs in reload.  */
       if (GET_CODE (a) == REG && REGNO (a) == SP_REGNUM)
@@ -4116,25 +4125,19 @@ 
           ops[0] = operands[0];
           ops[1] = a;
       
-          output_asm_insn (\"mov	%0, %1\", ops);
+          output_asm_insn ("mov\t%0, %1", ops);
 
           XEXP (mem, 0) = operands[0];
        }
-
-      else if (   GET_CODE (a) == LABEL_REF
-	       && GET_CODE (b) == CONST_INT)
-        return \"ldr\\t%0, %1\";
     }
     
-  return \"ldrh\\t%0, %1\";
-  "
+  return "ldrh\t%0, %1";
+}
   [(set_attr_alternative "length"
 			 [(if_then_else (eq_attr "is_arch6" "yes")
 				       (const_int 2) (const_int 4))
 			 (const_int 4)])
-   (set_attr "type" "alu_shift,load_byte")
-   (set_attr "pool_range" "*,60")]
-)
+   (set_attr "type" "alu_shift,load_byte")])
 
 (define_insn "*arm_zero_extendhisi2"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
@@ -4144,10 +4147,7 @@ 
    #
    ldr%(h%)\\t%0, %1"
   [(set_attr "type" "alu_shift,load_byte")
-   (set_attr "predicable" "yes")
-   (set_attr "pool_range" "*,256")
-   (set_attr "neg_pool_range" "*,244")]
-)
+   (set_attr "predicable" "yes")])
 
 (define_insn "*arm_zero_extendhisi2_v6"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
@@ -4157,10 +4157,7 @@ 
    uxth%?\\t%0, %1
    ldr%(h%)\\t%0, %1"
   [(set_attr "type" "alu_shift,load_byte")
-   (set_attr "predicable" "yes")
-   (set_attr "pool_range" "*,256")
-   (set_attr "neg_pool_range" "*,244")]
-)
+   (set_attr "predicable" "yes")])
 
 (define_insn "*arm_zero_extendhisi2addsi"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
@@ -4228,10 +4225,8 @@ 
   "@
    uxtb\\t%0, %1
    ldrb\\t%0, %1"
-  [(set_attr "length" "2,2")
-   (set_attr "type" "alu_shift,load_byte")
-   (set_attr "pool_range" "*,32")]
-)
+  [(set_attr "length" "2")
+   (set_attr "type" "alu_shift,load_byte")])
 
 (define_insn "*arm_zero_extendqisi2"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
@@ -4242,10 +4237,7 @@ 
    ldr%(b%)\\t%0, %1\\t%@ zero_extendqisi2"
   [(set_attr "length" "8,4")
    (set_attr "type" "alu_shift,load_byte")
-   (set_attr "predicable" "yes")
-   (set_attr "pool_range" "*,4096")
-   (set_attr "neg_pool_range" "*,4084")]
-)
+   (set_attr "predicable" "yes")])
 
 (define_insn "*arm_zero_extendqisi2_v6"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
@@ -4255,10 +4247,7 @@ 
    uxtb%(%)\\t%0, %1
    ldr%(b%)\\t%0, %1\\t%@ zero_extendqisi2"
   [(set_attr "type" "alu_shift,load_byte")
-   (set_attr "predicable" "yes")
-   (set_attr "pool_range" "*,4096")
-   (set_attr "neg_pool_range" "*,4084")]
-)
+   (set_attr "predicable" "yes")])
 
 (define_insn "*arm_zero_extendqisi2addsi"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
Index: gcc/config/arm/thumb2.md
===================================================================
--- gcc.orig/config/arm/thumb2.md
+++ gcc/config/arm/thumb2.md
@@ -792,145 +792,6 @@ 
 
 ;; Zero and sign extension instructions.
 
-(define_insn_and_split "*thumb2_zero_extendsidi2"
-  [(set (match_operand:DI 0 "s_register_operand" "=r")
-        (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r")))]
-  "TARGET_THUMB2"
-  "mov%?\\t%Q0, %1\;mov%?\\t%R0, #0"
-  "&& reload_completed"
-  [(set (match_dup 0) (match_dup 1))]
-  "
-  {
-    rtx lo_part = gen_lowpart (SImode, operands[0]);
-    if (!REG_P (lo_part) || REGNO (lo_part) != REGNO (operands[1]))
-      emit_move_insn (lo_part, operands[1]);
-    operands[0] = gen_highpart (SImode, operands[0]);
-    operands[1] = const0_rtx;
-  }
-  "
-  [(set_attr "length" "8")
-   (set_attr "ce_count" "2")
-   (set_attr "predicable" "yes")]
-)
-
-(define_insn_and_split "*thumb2_zero_extendhidi2"
-  [(set (match_operand:DI                 0 "s_register_operand"  "=r,r")
-	(zero_extend:DI (match_operand:HI 1 "nonimmediate_operand" "r,m")))]
-  "TARGET_THUMB2"
-  "@
-   uxth%?\\t%Q0, %1\;mov%?\\t%R0, #0
-   ldr%(h%)\\t%Q0, %1\;mov%?\\t%R0, #0"
-  "&& reload_completed"
-  [(set (match_dup 0) (zero_extend:SI (match_dup 1)))
-   (set (match_dup 2) (match_dup 3))]
-  "
-  {
-    operands[2] = gen_highpart (SImode, operands[0]);
-    operands[0] = gen_lowpart (SImode, operands[0]);
-    operands[3] = const0_rtx;
-  }
-  "
-  [(set_attr "length" "8")
-   (set_attr "ce_count" "2")
-   (set_attr "predicable" "yes")
-   (set_attr "type" "*,load_byte")
-   (set_attr "pool_range" "*,4092")
-   (set_attr "neg_pool_range" "*,250")]
-)
-
-(define_insn_and_split "*thumb2_zero_extendqidi2"
-  [(set (match_operand:DI                 0 "s_register_operand"  "=r,r")
-	(zero_extend:DI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
-  "TARGET_THUMB2"
-  "@
-   uxtb%?\\t%Q0, %1\;mov%?\\t%R0, #0
-   ldr%(b%)\\t%Q0, %1\;mov%?\\t%R0, #0"
-  "&& reload_completed"
-  [(set (match_dup 0) (zero_extend:SI (match_dup 1)))
-   (set (match_dup 2) (match_dup 3))]
-  "
-  {
-    operands[2] = gen_highpart (SImode, operands[0]);
-    operands[0] = gen_lowpart (SImode, operands[0]);
-    operands[3] = const0_rtx;
-  }
-  "
-  [(set_attr "length" "8")
-   (set_attr "ce_count" "2")
-   (set_attr "predicable" "yes")
-   (set_attr "type" "*,load_byte")
-   (set_attr "pool_range" "*,4092")
-   (set_attr "neg_pool_range" "*,250")]
-)
-
-(define_insn_and_split "*thumb2_extendsidi2"
-  [(set (match_operand:DI 0 "s_register_operand" "=r")
-        (sign_extend:DI (match_operand:SI 1 "s_register_operand" "r")))]
-  "TARGET_THUMB2"
-  "mov%?\\t%Q0, %1\;asr?\\t%R0, %1, #31"
-  "&& reload_completed"
-  [(set (match_dup 0) (ashiftrt:SI (match_dup 1) (const_int 31)))]
-  {
-    rtx lo_part = gen_lowpart (SImode, operands[0]);
-
-    if (!REG_P (lo_part) || REGNO (lo_part) != REGNO (operands[1]))
-      emit_move_insn (lo_part, operands[1]);
-    operands[0] = gen_highpart (SImode, operands[0]);
-  }
-  [(set_attr "length" "8")
-   (set_attr "ce_count" "2")
-   (set_attr "shift" "1")
-   (set_attr "predicable" "yes")]
-)
-
-(define_insn_and_split "*thumb2_extendhidi2"
-  [(set (match_operand:DI                 0 "s_register_operand"  "=r,r")
-	(sign_extend:DI (match_operand:HI 1 "nonimmediate_operand" "r,m")))]
-  "TARGET_THUMB2"
-  "@
-   sxth%?\\t%Q0, %1\;asr%?\\t%R0, %Q0, #31
-   ldrsh%?\\t%Q0, %1\;asr%?\\t%R0, %Q0, #31"
-  "&& reload_completed"
-  [(set (match_dup 0) (sign_extend:SI (match_dup 1)))
-   (set (match_dup 2) (ashiftrt:SI (match_dup 0) (const_int 31)))]
-  "
-  {
-    operands[2] = gen_highpart (SImode, operands[0]);
-    operands[0] = gen_lowpart (SImode, operands[0]);
-  }
-  "
-  [(set_attr "length" "8")
-   (set_attr "ce_count" "2")
-   (set_attr "predicable" "yes")
-   (set_attr "type" "*,load_byte")
-   (set_attr "pool_range" "*,4092")
-   (set_attr "neg_pool_range" "*,250")]
-)
-
-(define_insn_and_split "*thumb2_extendqidi2"
-  [(set (match_operand:DI                 0 "s_register_operand"  "=r,r")
-	(sign_extend:DI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
-  "TARGET_THUMB2"
-  "@
-   sxtb%?\\t%Q0, %1\;asr%?\\t%R0, %Q0, #31
-   ldrsb%?\\t%Q0, %1\;asr%?\\t%R0, %Q0, #31"
-  "&& reload_completed"
-  [(set (match_dup 0) (sign_extend:SI (match_dup 1)))
-   (set (match_dup 2) (ashiftrt:SI (match_dup 0) (const_int 31)))]
-  "
-  {
-    operands[2] = gen_highpart (SImode, operands[0]);
-    operands[0] = gen_lowpart (SImode, operands[0]);
-  }
-  "
-  [(set_attr "length" "8")
-   (set_attr "ce_count" "2")
-   (set_attr "predicable" "yes")
-   (set_attr "type" "*,load_byte")
-   (set_attr "pool_range" "*,4092")
-   (set_attr "neg_pool_range" "*,250")]
-)
-
 ;; All supported Thumb2 implementations are armv6, so only that case is
 ;; provided.
 (define_insn "*thumb2_extendqisi_v6"
Index: testsuite/gcc.target/arm/pr43137.c
===================================================================
--- testsuite/gcc.target/arm/pr43137.c	(revision 0)
+++ testsuite/gcc.target/arm/pr43137.c	(revision 0)
@@ -0,0 +1,9 @@ 
+/* { dg-options "-O2" }  */
+/* { dg-final { scan-assembler-not "mov\tr1, r\[1-9\]" } } */
+
+int foo();
+long long bar22()
+{
+  int result = foo();
+  return result;
+}