diff mbox

[ARM] Thumb-2 12-bit immediates in ADD and SUB instructions

Message ID 4DD63839.7040806@ispras.ru
State New
Headers show

Commit Message

Dmitry Plotnikov May 20, 2011, 9:45 a.m. UTC
This patch adds support for 12-bit immediate values for Thumb-2 in ADD and
SUB instructions.  We added two new alternatives for *arm_addsi3 which
make use of two new constraints for 12-bit values.  Also we modified
costs of PLUS rtx expression.
This patch reduces size of libevas by 1788 bytes (from 464916 to
463128), and sqlite by 54 bytes (from 266156 to 266052).
Regtested with Cortex-A8 QEMU.

Ok for trunk?

Comments

Richard Earnshaw May 20, 2011, 10:10 a.m. UTC | #1
On Fri, 2011-05-20 at 18:06 +0200, Chung-Lin Tang wrote:
> On 2011/5/20 11:45 AM, Dmitry Plotnikov wrote:
> > This patch adds support for 12-bit immediate values for Thumb-2 in ADD and
> > SUB instructions.  We added two new alternatives for *arm_addsi3 which
> > make use of two new constraints for 12-bit values.  Also we modified
> > costs of PLUS rtx expression.
> > This patch reduces size of libevas by 1788 bytes (from 464916 to
> > 463128), and sqlite by 54 bytes (from 266156 to 266052).
> > Regtested with Cortex-A8 QEMU.
> > 
> > Ok for trunk?
> > 
> 
> Andrew Stubbs seem to have another patch related to ADDW/SUBW support,
> which I think is not yet committed to trunk. I have not yet studied how
> this patch and Andrew's relate.
> 
> That aside, I think the style of adding new alternatives for this
> purpose is a little unneeded. I suggest:
> 
> 1) Abstract out const_ok_for_arm() into const_ok_for_arm_outer() with an
> OUTER rtx code argument, and a const_ok_for_arm() with OUTER passed 0.
> 
> 2) Within const_ok_for_arm_outer(), test for OUTER==PLUS and
> TARGET_THUMB2 as needed.
> 
> 3) Migrate from const_ok_for_arm() to const_ok_for_arm_outer() as
> needed: in pattern conditions, etc.

We already have const_ok_for_op, which already does this sort of thing.

R.

> 
> I'll also note here that ADD/SUB are not the only instructions with
> 12-bit immediate under Thumb-2; so does AND, ORR, etc.
> 
> Chung-Lin
>
Andrew Stubbs May 20, 2011, 10:37 a.m. UTC | #2
On 20/05/11 10:45, Dmitry Plotnikov wrote:
> This patch adds support for 12-bit immediate values for Thumb-2 in ADD and
> SUB instructions.  We added two new alternatives for *arm_addsi3 which
> make use of two new constraints for 12-bit values.  Also we modified
> costs of PLUS rtx expression.

A already have a patch submitted for review that does all this.

http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01657.html

> This patch reduces size of libevas by 1788 bytes (from 464916 to
> 463128), and sqlite by 54 bytes (from 266156 to 266052).
> Regtested with Cortex-A8 QEMU.

I also have another patch that improves support for thumb2 replicated 
constants. I'd be interested whether the patch makes a difference to 
your benchmark?

http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01757.html (approved, but 
blocked on the addw/subw review).

> +/* Return TRUE if int I is a valid immediate THUMB-2 constant in add/sub
> + *  instructions in 12 bit encoding variants.  */
> +
> +int
> +const_ok_for_thumb2_12bit (HOST_WIDE_INT i)
> +{
> +   /* According to Thumb-2 instruction set manual such constants should be
> +   *  in range 0-4095.  */
> +  if ((i&  ~(unsigned HOST_WIDE_INT) 0xfff) == 0)
> +    return TRUE;
> +
> +  return FALSE;
> +}

As Richard and Chung-Lin said, we have const_ok_for_op for this sort of 
thing. I already patched it for movw 16-bit constants, and my patch 
above covers addw and subw.

> @@ -7164,7 +7179,10 @@ arm_rtx_costs_1 (rtx x, enum rtx_code outer, int* total, bool speed)
>
>       case CONST_INT:
>         if (const_ok_for_arm (INTVAL (x))
> -	  || const_ok_for_arm (~INTVAL (x)))
> +	  || const_ok_for_arm (~INTVAL (x))
> +	  || (TARGET_THUMB2&&  (outer == PLUS)
> +	&&  (const_ok_for_thumb2_12bit (INTVAL (x))
> +	          || const_ok_for_thumb2_12bit (~INTVAL (x)))))
>   	*total = COSTS_N_INSNS (1);
>         else
>   	*total = COSTS_N_INSNS (arm_gen_constant (SET, mode, NULL_RTX,

I think my patch may be missing this bit - good catch. It really should 
be recast in terms of const_ok_for_op, though.

> +   add%?\\t%0, %1, %2
> +   sub%?\\t%0, %1, #%n2

This looks wrong. add and sub only support 8-bit constants. You'd need 
addw and subw here.

Andrew
Chung-Lin Tang May 20, 2011, 11:32 a.m. UTC | #3
On 2011/5/21 12:06 AM, Chung-Lin Tang wrote:
> I'll also note here that ADD/SUB are not the only instructions with
> 12-bit immediate under Thumb-2; so does AND, ORR, etc.

FTR, I was wrong on the above statement. Only add/sub seems to have the
wide constant operands.

CL
Chung-Lin Tang May 20, 2011, 4:06 p.m. UTC | #4
On 2011/5/20 11:45 AM, Dmitry Plotnikov wrote:
> This patch adds support for 12-bit immediate values for Thumb-2 in ADD and
> SUB instructions.  We added two new alternatives for *arm_addsi3 which
> make use of two new constraints for 12-bit values.  Also we modified
> costs of PLUS rtx expression.
> This patch reduces size of libevas by 1788 bytes (from 464916 to
> 463128), and sqlite by 54 bytes (from 266156 to 266052).
> Regtested with Cortex-A8 QEMU.
> 
> Ok for trunk?
> 

Andrew Stubbs seem to have another patch related to ADDW/SUBW support,
which I think is not yet committed to trunk. I have not yet studied how
this patch and Andrew's relate.

That aside, I think the style of adding new alternatives for this
purpose is a little unneeded. I suggest:

1) Abstract out const_ok_for_arm() into const_ok_for_arm_outer() with an
OUTER rtx code argument, and a const_ok_for_arm() with OUTER passed 0.

2) Within const_ok_for_arm_outer(), test for OUTER==PLUS and
TARGET_THUMB2 as needed.

3) Migrate from const_ok_for_arm() to const_ok_for_arm_outer() as
needed: in pattern conditions, etc.

I'll also note here that ADD/SUB are not the only instructions with
12-bit immediate under Thumb-2; so does AND, ORR, etc.

Chung-Lin
Chung-Lin Tang May 20, 2011, 4:15 p.m. UTC | #5
On 2011/5/20 下午 12:10, Richard Earnshaw wrote:
> 
> On Fri, 2011-05-20 at 18:06 +0200, Chung-Lin Tang wrote:
>> On 2011/5/20 11:45 AM, Dmitry Plotnikov wrote:
>>> This patch adds support for 12-bit immediate values for Thumb-2 in ADD and
>>> SUB instructions.  We added two new alternatives for *arm_addsi3 which
>>> make use of two new constraints for 12-bit values.  Also we modified
>>> costs of PLUS rtx expression.
>>> This patch reduces size of libevas by 1788 bytes (from 464916 to
>>> 463128), and sqlite by 54 bytes (from 266156 to 266052).
>>> Regtested with Cortex-A8 QEMU.
>>>
>>> Ok for trunk?
>>>
>>
>> Andrew Stubbs seem to have another patch related to ADDW/SUBW support,
>> which I think is not yet committed to trunk. I have not yet studied how
>> this patch and Andrew's relate.
>>
>> That aside, I think the style of adding new alternatives for this
>> purpose is a little unneeded. I suggest:
>>
>> 1) Abstract out const_ok_for_arm() into const_ok_for_arm_outer() with an
>> OUTER rtx code argument, and a const_ok_for_arm() with OUTER passed 0.
>>
>> 2) Within const_ok_for_arm_outer(), test for OUTER==PLUS and
>> TARGET_THUMB2 as needed.
>>
>> 3) Migrate from const_ok_for_arm() to const_ok_for_arm_outer() as
>> needed: in pattern conditions, etc.
> 
> We already have const_ok_for_op, which already does this sort of thing.
> 
> R.

I see now. Andrew's patch seems to be exactly adding this kind of
constant support for const_ok_for_op().

CL
Dmitry Plotnikov May 31, 2011, 3:27 p.m. UTC | #6
On 05/20/2011 02:37 PM, Andrew Stubbs wrote:
> On 20/05/11 10:45, Dmitry Plotnikov wrote:
>> This patch adds support for 12-bit immediate values for Thumb-2 in 
>> ADD and
>> SUB instructions.  We added two new alternatives for *arm_addsi3 which
>> make use of two new constraints for 12-bit values.  Also we modified
>> costs of PLUS rtx expression.
>
> A already have a patch submitted for review that does all this.
>
> http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01657.html
Sorry, we missed this message.
>
>> This patch reduces size of libevas by 1788 bytes (from 464916 to
>> 463128), and sqlite by 54 bytes (from 266156 to 266052).
>> Regtested with Cortex-A8 QEMU.
>
> I also have another patch that improves support for thumb2 replicated 
> constants. I'd be interested whether the patch makes a difference to 
> your benchmark?
>
> http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01757.html (approved, but 
> blocked on the addw/subw review).
I tested the patches on libevas.  Code size is reduced by 64 bytes with 
your addw/subw patch (this effect is actually similar to our patch; 
there was an error  with numbers reported in the previous message).  
Adding your replicated constants patch reduces code size additionally by 
808 bytes.  These patches didn't affect the performance on libevas.
>
>> +/* Return TRUE if int I is a valid immediate THUMB-2 constant in 
>> add/sub
>> + *  instructions in 12 bit encoding variants.  */
>> +
>> +int
>> +const_ok_for_thumb2_12bit (HOST_WIDE_INT i)
>> +{
>> +   /* According to Thumb-2 instruction set manual such constants 
>> should be
>> +   *  in range 0-4095.  */
>> +  if ((i&  ~(unsigned HOST_WIDE_INT) 0xfff) == 0)
>> +    return TRUE;
>> +
>> +  return FALSE;
>> +}
>
> As Richard and Chung-Lin said, we have const_ok_for_op for this sort 
> of thing. I already patched it for movw 16-bit constants, and my patch 
> above covers addw and subw.
>
>> @@ -7164,7 +7179,10 @@ arm_rtx_costs_1 (rtx x, enum rtx_code outer, 
>> int* total, bool speed)
>>
>>       case CONST_INT:
>>         if (const_ok_for_arm (INTVAL (x))
>> -      || const_ok_for_arm (~INTVAL (x)))
>> +      || const_ok_for_arm (~INTVAL (x))
>> +      || (TARGET_THUMB2&&  (outer == PLUS)
>> + &&  (const_ok_for_thumb2_12bit (INTVAL (x))
>> +              || const_ok_for_thumb2_12bit (~INTVAL (x)))))
>>       *total = COSTS_N_INSNS (1);
>>         else
>>       *total = COSTS_N_INSNS (arm_gen_constant (SET, mode, NULL_RTX,
>
> I think my patch may be missing this bit - good catch. It really 
> should be recast in terms of const_ok_for_op, though.
>
Would you include this in your patch? Or should we submit it as a 
separate patch?
Andrew Stubbs June 1, 2011, 8:56 a.m. UTC | #7
On 31/05/11 16:27, Dmitry Plotnikov wrote:
> Would you include this in your patch? Or should we submit it as a
> separate patch?

I'm not sure I *can* commit your patches, legally speaking, although 
this one is small enough that probably it's ok ... probably.

Perhaps you should submit it yourself, once mine has gone in (assuming 
it ever gets reviewed ... Richard) and then you can take all the glory 
you're due! ;)

Andrew
Ramana Radhakrishnan June 2, 2011, 8:30 a.m. UTC | #8
> Would you include this in your patch? Or should we submit it as a
> separate patch?


Could you submit this as a follow-up patch that touches the costs. I 
would rather that these changes also went in when we were looking at 
this area ?

cheers
Ramana
>
diff mbox

Patch

gcc/config/arm/

2011-05-19  Dmitry Plotnikov  <dplotnikov@ispras.ru>

	* arm-protos.h (const_ok_for_thumb2_12bit): New prototype.

	* arm.c (const_ok_for_thumb2_12bit): New function.
	(arm_rtx_costs_1): Modified costs for PLUS rtx.

	* arm.md (*arm_addsi3): New variants.
	
	* constraints.md (Pn): New constraint.
	(Pq): New constraint.

gcc/testsuite/

2010-05-19  Dmitry Plotnikov  <dplotnikov@ispras.ru>

	* gcc.target/arm/thumb2-addw.c: New testcase.
	* gcc.target/arm/thumb2-subw.c: New testcase.

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 190fec0..aaebdba 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -46,6 +46,7 @@  extern bool arm_vector_mode_supported_p (enum machine_mode);
 extern bool arm_small_register_classes_for_mode_p (enum machine_mode);
 extern int arm_hard_regno_mode_ok (unsigned int, enum machine_mode);
 extern int const_ok_for_arm (HOST_WIDE_INT);
+extern int const_ok_for_thumb2_12bit (HOST_WIDE_INT);
 extern int arm_split_constant (RTX_CODE, enum machine_mode, rtx,
 			       HOST_WIDE_INT, rtx, rtx, int);
 extern RTX_CODE arm_canonicalize_comparison (RTX_CODE, rtx *, rtx *);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5f964d6..db2e5f2 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2298,6 +2298,21 @@  const_ok_for_arm (HOST_WIDE_INT i)
   return FALSE;
 }
 
+/* Return TRUE if int I is a valid immediate THUMB-2 constant in add/sub 
+ *  instructions in 12 bit encoding variants.  */
+
+int
+const_ok_for_thumb2_12bit (HOST_WIDE_INT i) 
+{
+   /* According to Thumb-2 instruction set manual such constants should be 
+   *  in range 0-4095.  */
+  if ((i & ~(unsigned HOST_WIDE_INT) 0xfff) == 0)
+    return TRUE;
+
+  return FALSE;
+}
+
+
 /* Return true if I is a valid constant for the operation CODE.  */
 static int
 const_ok_for_op (HOST_WIDE_INT i, enum rtx_code code)
@@ -7164,7 +7179,10 @@  arm_rtx_costs_1 (rtx x, enum rtx_code outer, int* total, bool speed)
 
     case CONST_INT:
       if (const_ok_for_arm (INTVAL (x))
-	  || const_ok_for_arm (~INTVAL (x)))
+	  || const_ok_for_arm (~INTVAL (x))
+	  || (TARGET_THUMB2 && (outer == PLUS) 
+	      && (const_ok_for_thumb2_12bit (INTVAL (x))
+	          || const_ok_for_thumb2_12bit (~INTVAL (x)))))
 	*total = COSTS_N_INSNS (1);
       else
 	*total = COSTS_N_INSNS (arm_gen_constant (SET, mode, NULL_RTX,
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 9e23d9b..0a92b0a 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -707,21 +707,26 @@ 
 ;;  (plus (reg rN) (reg sp)) into (reg rN).  In this case reload will
 ;; put the duplicated register first, and not try the commutative version.
 (define_insn_and_split "*arm_addsi3"
-  [(set (match_operand:SI          0 "s_register_operand" "=r, k,r,r, k,r")
-	(plus:SI (match_operand:SI 1 "s_register_operand" "%rk,k,r,rk,k,rk")
-		 (match_operand:SI 2 "reg_or_int_operand" "rI,rI,k,L, L,?n")))]
+  [(set (match_operand:SI          0 "s_register_operand" "=r ,k, r,r, r, r, k,r")
+	(plus:SI (match_operand:SI 1 "s_register_operand" "%rk,k, r,rk, rk, rk,k,rk")
+		 (match_operand:SI 2 "reg_or_int_operand" "rI ,rI,k,Pq,Pn,L, L,?n")))]
   "TARGET_32BIT"
   "@
    add%?\\t%0, %1, %2
    add%?\\t%0, %1, %2
    add%?\\t%0, %2, %1
+   add%?\\t%0, %1, %2
+   sub%?\\t%0, %1, #%n2
    sub%?\\t%0, %1, #%n2
    sub%?\\t%0, %1, #%n2
    #"
   "TARGET_32BIT
    && GET_CODE (operands[2]) == CONST_INT
    && !(const_ok_for_arm (INTVAL (operands[2]))
-        || const_ok_for_arm (-INTVAL (operands[2])))
+        || const_ok_for_arm (-INTVAL (operands[2]))
+        || (TARGET_THUMB2 
+            && (const_ok_for_thumb2_12bit (INTVAL (operands[2]))
+                || const_ok_for_thumb2_12bit (-INTVAL (operands[2])))))
    && (reload_completed || !arm_eliminable_register (operands[1]))"
   [(clobber (const_int 0))]
   "
@@ -730,7 +735,7 @@ 
 		      operands[1], 0);
   DONE;
   "
-  [(set_attr "length" "4,4,4,4,4,16")
+  [(set_attr "length" "4,4,4,4,4,4,4,16")
    (set_attr "predicable" "yes")]
 )
 
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index f5b8521..71b7130 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -160,6 +160,16 @@ 
   (and (match_code "const_int")
        (match_test "TARGET_THUMB1 && ival >= 0 && ival <= 7")))
 
+(define_constraint "Pn"
+  "@internal In Thumb-2 state a constant in the range -4095 to 0"
+  (and (match_code "const_int")
+       (match_test "TARGET_THUMB2 && ival >= -4094 && ival <= 0")))
+
+(define_constraint "Pq"
+  "@internal In Thumb-2 state a constant in the range 0 to 4095"
+  (and (match_code "const_int")
+       (match_test "TARGET_THUMB2 && ival >= 0 && ival <= 4095")))
+
 (define_constraint "Ps"
   "@internal In Thumb-2 state a constant in the range -255 to +255"
   (and (match_code "const_int")
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-addw.c b/gcc/testsuite/gcc.target/arm/thumb2-addw.c
new file mode 100644
index 0000000..e84dc69
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb2-addw.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb2_ok} */
+/* { dg-options "-O2 -mthumb" } */
+/* { dg-final { scan-assembler "add.*#3565" } } */
+
+/* Verify that 12-bit immediate is used.  */
+int f1(int n) {
+       return n + 0xded;
+}
+
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-subw.c b/gcc/testsuite/gcc.target/arm/thumb2-subw.c
new file mode 100644
index 0000000..78f5b09
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb2-subw.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb2_ok} */
+/* { dg-options "-O2 -mthumb" } */
+/* { dg-final { scan-assembler "sub.*#3565" } } */
+
+/* Verify that 12-bit immediate is used.  */
+int f1(int n) {
+       return n - 0xded;
+}
+