Patchwork [ARM,1/3] PR53189: optimizations of 64bit logic operation with constant

login
register
mail settings
Submitter Carrot Wei
Date July 3, 2012, 11:28 a.m.
Message ID <CAEe8uEAoQr94DBV6X8-zVW5Yb64bfi9aHSXOV1_8zVh6L2VjUg@mail.gmail.com>
Download mbox | patch
Permalink /patch/168796/
State New
Headers show

Comments

Carrot Wei - July 3, 2012, 11:28 a.m.
On Thu, Jun 28, 2012 at 12:14 AM, Ramana Radhakrishnan
<ramana.radhakrishnan@linaro.org> wrote:
> On 28 May 2012 11:08, Carrot Wei <carrot@google.com> wrote:
>> Hi
>>
>> This is the second part of the patches that deals with 64bit and. It directly
>> extends the patterns anddi3, anddi3_insn and anddi3_neon to handle 64bit
>> constant operands.
>>
>
> Comments about const_di_ok_for_op still apply from my review of your add patch.
>
> However I don't see why and /ior / xor with constants that have either
> the low or high parts set can't be expanded directly into ands of
> subregs with moves of zero's or the original value especially if you
> aren't looking at doing 64 bit operations in neon .With Neon being
> used for 64 bit arithmetic it gets more interesting.
>
> Finally this should target PR target/53189.
>

Hi Ramana

Thanks for the review. Following is the updated patch according to
your comments.

Tested on arm qemu with all arm/thumb neon/non-neon mode combination
without regression.

thanks
Carrot


2012-07-03  Wei Guozhi  <carrot@google.com>

	PR target/53189
	* gcc.target/arm/pr53189-1.c: New testcase.
	* gcc.target/arm/pr53189-2.c: New testcase.
	* gcc.target/arm/pr53189-3.c: New testcase.


2012-07-03  Wei Guozhi  <carrot@google.com>

	PR target/53189
	* config/arm/arm.c (const_ok_for_dimode_op): Handle AND op.
	* config/arm/constraints.md (De): New constraint.
	* config/arm/predicates.md (arm_anddi_operand): New predicate.
	(arm_immediate_anddi_operand): Likewise.
	(anddi_operand): Likewise.
	* config/arm/arm.md (arm_andsi3_insn): Optimization for special
	constants.
	(anddi3): Extend it to handle 64bit constants.
	(anddi3_insn): Likewise.
	* config/arm/neon.md (anddi3_neon): Likewise.
Ramana Radhakrishnan - July 17, 2012, 1:47 p.m.
Carrot,

Sorry about the delayed response.

On 3 July 2012 12:28, Carrot Wei <carrot@google.com> wrote:
> On Thu, Jun 28, 2012 at 12:14 AM, Ramana Radhakrishnan
> <ramana.radhakrishnan@linaro.org> wrote:
>> On 28 May 2012 11:08, Carrot Wei <carrot@google.com> wrote:
>>> Hi
>>>
>>> This is the second part of the patches that deals with 64bit and. It directly
>>> extends the patterns anddi3, anddi3_insn and anddi3_neon to handle 64bit
>>> constant operands.
>>>
>>
>> Comments about const_di_ok_for_op still apply from my review of your add patch.
>>
>> However I don't see why and /ior / xor with constants that have either
>> the low or high parts set can't be expanded directly into ands of
>> subregs with moves of zero's or the original value especially if you
>> aren't looking at doing 64 bit operations in neon .With Neon being
>> used for 64 bit arithmetic it gets more interesting.
>>
>> Finally this should target PR target/53189.
>>
>
> Hi Ramana
>
> Thanks for the review. Following is the updated patch according to
> your comments.

You've missed answering this part of my review :)

>> However I don't see why and /ior / xor with constants that have either
>> the low or high parts set can't be expanded directly into ands of
>> subregs with moves of zero's or the original value especially if you
>> aren't looking at doing 64 bit operations in neon .With Neon being
>> used for 64 bit arithmetic it gets more interesting.

Is there any reason why we don't split such cases earlier into the
constituent moves and the associated ands earlier than reload in the
non-Neon case?

 In addition, it would be good to have some tests for Thumb2 that deal
with the replicated constants for Thumb2 . Can you have a look at
creating some tests similar to the thumb2*replicated*.c tests in
gcc.target/arm but for 64 bit constants ?


regards,
Ramana

Patch

Index: testsuite/gcc.target/arm/pr53189-2.c
===================================================================
--- testsuite/gcc.target/arm/pr53189-2.c	(revision 0)
+++ testsuite/gcc.target/arm/pr53189-2.c	(revision 0)
@@ -0,0 +1,9 @@ 
+/* { dg-options "-O2" }  */
+/* { dg-require-effective-target arm32 } */
+/* { dg-final { scan-assembler-times "mov" 1 } } */
+/* { dg-final { scan-assembler-times "and" 1 } } */
+
+void t0p(long long * p)
+{
+  *p &= 0x100000000;
+}
Index: testsuite/gcc.target/arm/pr53189-3.c
===================================================================
--- testsuite/gcc.target/arm/pr53189-3.c	(revision 0)
+++ testsuite/gcc.target/arm/pr53189-3.c	(revision 0)
@@ -0,0 +1,9 @@ 
+/* { dg-options "-O2" }  */
+/* { dg-require-effective-target arm32 } */
+/* { dg-final { scan-assembler-not "mov" } } */
+/* { dg-final { scan-assembler-times "and" 1 } } */
+
+void t0p(long long * p)
+{
+  *p &= 0x1FFFFFFFF;
+}
Index: testsuite/gcc.target/arm/pr53189-1.c
===================================================================
--- testsuite/gcc.target/arm/pr53189-1.c	(revision 0)
+++ testsuite/gcc.target/arm/pr53189-1.c	(revision 0)
@@ -0,0 +1,8 @@ 
+/* { dg-options "-O2" }  */
+/* { dg-require-effective-target arm32 } */
+/* { dg-final { scan-assembler-not "mov" } } */
+
+void t0p(long long * p)
+{
+  *p &= 0x100000002;
+}
Index: config/arm/arm.c
===================================================================
--- config/arm/arm.c	(revision 189107)
+++ config/arm/arm.c	(working copy)
@@ -2524,6 +2524,10 @@ 
     case PLUS:
       return arm_not_operand (hi, SImode) && arm_add_operand (lo, SImode);

+    case AND:
+      return ((const_ok_for_arm (lo_val) || const_ok_for_arm (~lo_val))
+	      && (const_ok_for_arm (hi_val) || const_ok_for_arm (~hi_val)));
+
     default:
       return 0;
     }
Index: config/arm/neon.md
===================================================================
--- config/arm/neon.md	(revision 189107)
+++ config/arm/neon.md	(working copy)
@@ -776,9 +776,9 @@ 
 )

 (define_insn "anddi3_neon"
-  [(set (match_operand:DI 0 "s_register_operand" "=w,w,?&r,?&r,?w,?w")
-        (and:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,r,w,0")
-		(match_operand:DI 2 "neon_inv_logic_op2" "w,DL,r,r,w,DL")))]
+  [(set (match_operand:DI 0 "s_register_operand" "=w,w,?&r,?&r,?w,?w,?&r,?&r")
+        (and:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,r,w,0,0,r")
+		(match_operand:DI 2 "anddi_operand" "w,DL,r,r,w,DL,De,De")))]
   "TARGET_NEON"
 {
   switch (which_alternative)
@@ -790,12 +790,14 @@ 
     		     DImode, 1, VALID_NEON_QREG_MODE (DImode));
     case 2: return "#";
     case 3: return "#";
+    case 6: return "#";
+    case 7: return "#";
     default: gcc_unreachable ();
     }
 }
-  [(set_attr "neon_type" "neon_int_1,neon_int_1,*,*,neon_int_1,neon_int_1")
-   (set_attr "length" "*,*,8,8,*,*")
-   (set_attr "arch" "nota8,nota8,*,*,onlya8,onlya8")]
+  [(set_attr "neon_type" "neon_int_1,neon_int_1,*,*,neon_int_1,neon_int_1,*,*")
+   (set_attr "length" "*,*,8,8,*,*,8,8")
+   (set_attr "arch" "nota8,nota8,*,*,onlya8,onlya8,*,*")]
 )

 (define_insn "orn<mode>3_neon"
Index: config/arm/constraints.md
===================================================================
--- config/arm/constraints.md	(revision 189107)
+++ config/arm/constraints.md	(working copy)
@@ -31,7 +31,7 @@ 
 ;; 'H' was previously used for FPA.

 ;; The following multi-letter normal constraints have been used:
-;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Dv, Dy, Di, Dt, Dz
+;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, De, Dn, Dl, DL, Dv, Dy, Di, Dt, Dz
 ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
 ;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py

@@ -248,6 +248,12 @@ 
  (and (match_code "const_int")
       (match_test "TARGET_32BIT && const_ok_for_dimode_op (ival, PLUS)")))

+(define_constraint "De"
+ "@internal
+  In ARM/Thumb-2 state a const_int that can be used by anddi3_insn."
+ (and (match_code "const_int")
+      (match_test "TARGET_32BIT && const_ok_for_dimode_op (ival, AND)")))
+
 (define_constraint "Di"
  "@internal
   In ARM/Thumb-2 state a const_int or const_double where both the high
Index: config/arm/predicates.md
===================================================================
--- config/arm/predicates.md	(revision 189107)
+++ config/arm/predicates.md	(working copy)
@@ -104,6 +104,14 @@ 
   (and (match_code "const_int,const_double")
        (match_test "arm_const_double_by_immediates (op)")))

+(define_predicate "arm_immediate_anddi_operand"
+  (and (match_code "const_int")
+       (match_test "const_ok_for_dimode_op (INTVAL (op), AND)")))
+
+(define_predicate "arm_anddi_operand"
+  (ior (match_operand 0 "arm_immediate_anddi_operand")
+       (match_operand 0 "s_register_operand")))
+
 (define_predicate "arm_neg_immediate_operand"
   (and (match_code "const_int")
        (match_test "const_ok_for_arm (-INTVAL (op))")))
@@ -524,6 +532,10 @@ 
   (ior (match_operand 0 "imm_for_neon_inv_logic_operand")
        (match_operand 0 "s_register_operand")))

+(define_predicate "anddi_operand"
+  (ior (match_operand 0 "neon_inv_logic_op2")
+       (match_operand 0 "arm_anddi_operand")))
+
 ;; TODO: We could check lane numbers more precisely based on the mode.
 (define_predicate "neon_lane_number"
   (and (match_code "const_int")
Index: config/arm/arm.md
===================================================================
--- config/arm/arm.md	(revision 189107)
+++ config/arm/arm.md	(working copy)
@@ -2046,17 +2046,30 @@ 
 (define_expand "anddi3"
   [(set (match_operand:DI         0 "s_register_operand" "")
 	(and:DI (match_operand:DI 1 "s_register_operand" "")
-		(match_operand:DI 2 "neon_inv_logic_op2" "")))]
+		(match_operand:DI 2 "anddi_operand" "")))]
   "TARGET_32BIT"
   ""
 )

-(define_insn "*anddi3_insn"
-  [(set (match_operand:DI         0 "s_register_operand" "=&r,&r")
-	(and:DI (match_operand:DI 1 "s_register_operand"  "%0,r")
-		(match_operand:DI 2 "s_register_operand"   "r,r")))]
+(define_insn_and_split "*anddi3_insn"
+  [(set (match_operand:DI         0 "s_register_operand" "=&r,&r,&r,&r")
+	(and:DI (match_operand:DI 1 "s_register_operand"  "%0,r,0,r")
+		(match_operand:DI 2 "arm_anddi_operand"   "r,r,De,De")))]
   "TARGET_32BIT && !TARGET_IWMMXT && !TARGET_NEON"
   "#"
+  "TARGET_32BIT && !TARGET_IWMMXT && reload_completed
+   && !(TARGET_NEON && IS_VFP_REGNUM (REGNO (operands[0])))"
+  [(set (match_dup 0) (and:SI (match_dup 1) (match_dup 2)))
+   (set (match_dup 3) (and:SI (match_dup 4) (match_dup 5)))]
+  "
+  {
+    operands[3] = gen_highpart (SImode, operands[0]);
+    operands[0] = gen_lowpart (SImode, operands[0]);
+    operands[4] = gen_highpart (SImode, operands[1]);
+    operands[1] = gen_lowpart (SImode, operands[1]);
+    operands[5] = gen_highpart_mode (SImode, DImode, operands[2]);
+    operands[2] = gen_lowpart (SImode, operands[2]);
+  }"
   [(set_attr "length" "8")]
 )

@@ -2176,10 +2189,29 @@ 
 	(and:SI (match_operand:SI 1 "s_register_operand" "r,r,r")
 		(match_operand:SI 2 "reg_or_int_operand" "rI,K,?n")))]
   "TARGET_32BIT"
-  "@
-   and%?\\t%0, %1, %2
-   bic%?\\t%0, %1, #%B2
-   #"
+  "*
+  {
+    if (CONST_INT_P (operands[2]))
+      {
+	HOST_WIDE_INT i = INTVAL (operands[2]) & 0xFFFFFFFF;
+	if (i == 0)
+	  return \"mov%?\\t%0, #0\";
+	if (i == 0xFFFFFFFF)
+	  {
+	    if (!rtx_equal_p (operands[0], operands[1]))
+	      return \"mov%?\\t%0, %1\";
+	    else
+	      return \"\";
+	  }
+      }
+
+    switch (which_alternative)
+      {
+      case 0: return \"and%?\\t%0, %1, %2\";
+      case 1: return \"bic%?\\t%0, %1, #%B2\";
+      case 2: return \"#\";
+      }
+  }"
   "TARGET_32BIT
    && GET_CODE (operands[2]) == CONST_INT
    && !(const_ok_for_arm (INTVAL (operands[2]))