diff mbox

[ARM] Add support for overflow add, sub, and neg operations

Message ID 56F9BC1C.50603@linaro.org
State New
Headers show

Commit Message

Michael Collison March 28, 2016, 11:19 p.m. UTC
An updated patch that resolves the issues with thumb2 support and adds 
test cases as requested. Looking to check this into GCC 7 stage1 when it 
opens.

2016-02-24  Michael Collison  <michael.collison@arm.com>

     * config/arm/arm-modes.def: Add new condition code mode CC_V
     to represent the overflow bit.
     * config/arm/arm.c (maybe_get_arm_condition_code):
     Add support for CC_Vmode.
     * config/arm/arm.md (addv<mode>4, add<mode>3_compareV,
     addsi3_compareV_upper): New patterns to support signed
     builtin overflow add operations.
     (uaddv<mode>4, add<mode>3_compareC, addsi3_compareV_upper):
     New patterns to support unsigned builtin add overflow operations.
     (subv<mode>4, sub<mode>3_compare1): New patterns to support signed
     builtin overflow subtract operations,
     (usubv<mode>4): New patterns to support unsigned builtin subtract
     overflow operations.
     (negvsi3, negvdi3, negdi2_compare, negsi2_carryin_compare): New 
patterns
     to support builtin overflow negate operations.
     * gcc.target/arm/builtin_saddl.c: New testcase.
     * gcc.target/arm/builtin_saddll.c: New testcase.
     * gcc.target/arm/builtin_uaddl.c: New testcase.
     * gcc.target/arm/builtin_uaddll.c: New testcase.
     * gcc.target/arm/builtin_ssubl.c: New testcase.
     * gcc.target/arm/builtin_ssubll.c: New testcase.
     * gcc.target/arm/builtin_usubl.c: New testcase.
     * gcc.target/arm/builtin_usubll.c: New testcase.

On 02/29/2016 04:13 AM, Kyrill Tkachov wrote:
>
> On 26/02/16 10:32, Michael Collison wrote:
>>
>>
>> On 02/25/2016 02:51 AM, Kyrill Tkachov wrote:
>>> Hi Michael,
>>>
>>> On 24/02/16 23:02, Michael Collison wrote:
>>>> This patch adds support for builtin overflow of add, subtract and 
>>>> negate. This patch is targeted for gcc 7 stage 1. It was tested 
>>>> with no regressions in arm and thumb modes on the following targets:
>>>>
>>>> arm-non-linux-gnueabi
>>>> arm-non-linux-gnuabihf
>>>> armeb-none-linux-gnuabihf
>>>> arm-non-eabi
>>>>
>>>
>>> I'll have a deeper look once we're closer to GCC 7 development.
>>> I've got a few comments in the meantime.
>>>
>>>> 2016-02-24 Michael Collison <michael.collison@arm.com>
>>>>
>>>>     * config/arm/arm-modes.def: Add new condition code mode CC_V
>>>>     to represent the overflow bit.
>>>>     * config/arm/arm.c (maybe_get_arm_condition_code):
>>>>     Add support for CC_Vmode.
>>>>     * config/arm/arm.md (addv<mode>4, add<mode>3_compareV,
>>>>     addsi3_compareV_upper): New patterns to support signed
>>>>     builtin overflow add operations.
>>>>     (uaddv<mode>4, add<mode>3_compareC, addsi3_compareV_upper):
>>>>     New patterns to support unsigned builtin add overflow operations.
>>>>     (subv<mode>4, sub<mode>3_compare1): New patterns to support signed
>>>>     builtin overflow subtract operations,
>>>>     (usubv<mode>4): New patterns to support unsigned builtin subtract
>>>>     overflow operations.
>>>>     (negvsi3, negvdi3, negdi2_compre, negsi2_carryin_compare): New 
>>>> patterns
>>>>     to support builtin overflow negate operations.
>>>>
>>>>
>>>
>>> Can you please summarise what sequences are generated for these 
>>> operations, and how
>>> they are better than the default fallback sequences.
>>
>> Sure for a simple test case such as:
>>
>> int
>> fn3 (int x, int y, int *ovf)
>> {
>>   int res;
>>   *ovf = __builtin_sadd_overflow (x, y, &res);
>>   return res;
>> }
>>
>> Current trunk at -O2 generates
>>
>> fn3:
>>         @ args = 0, pretend = 0, frame = 0
>>         @ frame_needed = 0, uses_anonymous_args = 0
>>         @ link register save eliminated.
>>         cmp     r1, #0
>>         mov     r3, #0
>>         add     r1, r0, r1
>>         blt     .L4
>>         cmp     r1, r0
>>         blt     .L3
>> .L2:
>>         str     r3, [r2]
>>         mov     r0, r1
>>         bx      lr
>> .L4:
>>         cmp     r1, r0
>>         ble     .L2
>> .L3:
>>         mov     r3, #1
>>         b       .L2
>>
>> With the overflow patch this now generates:
>>
>>        adds    r0, r0, r1
>>        movvs   r3, #1
>>        movvc   r3, #0
>>        str     r3, [r2]
>>        bx      lr
>>
>
> Thanks! That looks much better.
>
>>> Also, we'd need tests for each of these overflow operations, since 
>>> these are pretty complex
>>> patterns that are being added.
>>
>> The patterns are tested now most notably by tests in:
>>
>> c-c++-common/torture/builtin-arith-overflow*.c
>>
>> I had a few failures I resolved so the builtin overflow arithmetic 
>> functions are definitely being exercised.
>
> Great, that gives me more confidence on the correctness aspects but...
>
>>>
>>> Also, you may want to consider splitting this into a patch series, 
>>> each adding a single
>>> overflow operation, together with its tests. That way it will be 
>>> easier to keep track of
>>> which pattern applies to which use case and they can go in 
>>> independently of each other.
>>
>> Let me know if you still fell the same way given the existing test 
>> cases.
>>
>
> ... I'd like us to still have scan-assembler tests. The torture tests 
> exercise the correctness,
> but we'd want tests to catch regressions where we stop generating the 
> new patterns due to other
> optimisation changes, which would lead to code quality regressions.
> So I'd like us to have scan-assembler tests for these sequences to 
> make sure we generate the right
> instructions.
>
> Thanks,
> Kyrill
>
>>>
>>> +(define_expand "uaddv<mode>4"
>>> +  [(match_operand:SIDI 0 "register_operand")
>>> +   (match_operand:SIDI 1 "register_operand")
>>> +   (match_operand:SIDI 2 "register_operand")
>>> +   (match_operand 3 "")]
>>> +  "TARGET_ARM"
>>> +{
>>> +  emit_insn (gen_add<mode>3_compareC (operands[0], operands[1], 
>>> operands[2]));
>>> +
>>> +  rtx x;
>>> +  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Cmode, CC_REGNUM), 
>>> const0_rtx);
>>> +  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
>>> +                gen_rtx_LABEL_REF (VOIDmode, operands[3]),
>>> +                pc_rtx);
>>> +  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
>>> +  DONE;
>>> +})
>>> +
>>>
>>> I notice this and many other patterns in this patch are guarded on 
>>> TARGET_ARM. Is there any reason why they
>>> should be restricted to arm state and not be TARGET_32BIT ?
>> I thought about this as well. I will test will TARGET_32BIT and get 
>> back to you.
>>>
>>>
>>> Thanks,
>>> Kyrill
>>
>
diff mbox

Patch

diff --git a/gcc/config/arm/arm-modes.def b/gcc/config/arm/arm-modes.def
index 1819553..69231f2 100644
--- a/gcc/config/arm/arm-modes.def
+++ b/gcc/config/arm/arm-modes.def
@@ -59,6 +59,7 @@  CC_MODE (CC_DGEU);
 CC_MODE (CC_DGTU);
 CC_MODE (CC_C);
 CC_MODE (CC_N);
+CC_MODE (CC_V);
 
 /* Vector modes.  */
 VECTOR_MODES (INT, 4);        /*            V4QI V2HI */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c868490..406b306 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -22906,6 +22906,8 @@  maybe_get_arm_condition_code (rtx comparison)
 	{
 	case LTU: return ARM_CS;
 	case GEU: return ARM_CC;
+	case NE: return ARM_CS;
+	case EQ: return ARM_CC;
 	default: return ARM_NV;
 	}
 
@@ -22931,6 +22933,15 @@  maybe_get_arm_condition_code (rtx comparison)
 	default: return ARM_NV;
 	}
 
+    case CC_Vmode:
+      switch (comp_code)
+	{
+	case NE: return ARM_VS;
+	case EQ: return ARM_VC;
+	default: return ARM_NV;
+
+	}
+
     case CCmode:
       switch (comp_code)
 	{
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 47171b9..812f082 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -539,6 +539,42 @@ 
    (set_attr "type" "multiple")]
 )
 
+(define_expand "addv<mode>4"
+  [(match_operand:SIDI 0 "register_operand")
+   (match_operand:SIDI 1 "register_operand")
+   (match_operand:SIDI 2 "register_operand")
+   (match_operand 3 "")]
+  "TARGET_32BIT"
+{
+  emit_insn (gen_add<mode>3_compareV (operands[0], operands[1], operands[2]));
+
+  rtx x;
+  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Vmode, CC_REGNUM), const0_rtx);
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+			    gen_rtx_LABEL_REF (VOIDmode, operands[3]),
+			    pc_rtx);
+  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
+  DONE;
+})
+
+(define_expand "uaddv<mode>4"
+  [(match_operand:SIDI 0 "register_operand")
+   (match_operand:SIDI 1 "register_operand")
+   (match_operand:SIDI 2 "register_operand")
+   (match_operand 3 "")]
+  "TARGET_32BIT"
+{
+  emit_insn (gen_add<mode>3_compareC (operands[0], operands[1], operands[2]));
+
+  rtx x;
+  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Cmode, CC_REGNUM), const0_rtx);
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+			    gen_rtx_LABEL_REF (VOIDmode, operands[3]),
+			    pc_rtx);
+  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
+  DONE;
+})
+
 (define_expand "addsi3"
   [(set (match_operand:SI          0 "s_register_operand" "")
 	(plus:SI (match_operand:SI 1 "s_register_operand" "")
@@ -616,6 +652,165 @@ 
  ]
 )
 
+(define_insn_and_split "adddi3_compareV"
+  [(set (reg:CC_V CC_REGNUM)
+	(ne:CC_V
+	  (plus:TI
+	    (sign_extend:TI (match_operand:DI 1 "register_operand" "r"))
+	    (sign_extend:TI (match_operand:DI 2 "register_operand" "r")))
+	  (sign_extend:TI (plus:DI (match_dup 1) (match_dup 2)))))
+   (set (match_operand:DI 0 "register_operand" "=&r")
+	(plus:DI (match_dup 1) (match_dup 2)))]
+  "TARGET_32BIT"
+  "#"
+  "TARGET_32BIT && reload_completed"
+  [(parallel [(set (reg:CC_C CC_REGNUM)
+		   (compare:CC_C (plus:SI (match_dup 1) (match_dup 2))
+				 (match_dup 1)))
+	      (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))])
+   (parallel [(set (reg:CC_V CC_REGNUM)
+		   (ne:CC_V
+		    (plus:DI (plus:DI
+			      (sign_extend:DI (match_dup 4))
+			      (sign_extend:DI (match_dup 5)))
+			     (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
+		    (plus:DI (sign_extend:DI
+			      (plus:SI (match_dup 4) (match_dup 5)))
+			     (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
+	     (set (match_dup 3) (plus:SI (plus:SI
+					  (match_dup 4) (match_dup 5))
+					 (ltu:SI (reg:CC_C CC_REGNUM)
+						 (const_int 0))))])]
+  "
+  {
+    operands[3] = gen_highpart (SImode, operands[0]);
+    operands[0] = gen_lowpart (SImode, operands[0]);
+    operands[4] = gen_highpart (SImode, operands[1]);
+    operands[1] = gen_lowpart (SImode, operands[1]);
+    operands[5] = gen_highpart (SImode, operands[2]);
+    operands[2] = gen_lowpart (SImode, operands[2]);
+  }"
+ [(set_attr "conds" "set")
+   (set_attr "length" "8")
+   (set_attr "type" "multiple")]
+)
+
+(define_insn "addsi3_compareV"
+  [(set (reg:CC_V CC_REGNUM)
+	(ne:CC_V
+	  (plus:DI
+	    (sign_extend:DI (match_operand:SI 1 "register_operand" "r"))
+	    (sign_extend:DI (match_operand:SI 2 "register_operand" "r")))
+	  (sign_extend:DI (plus:SI (match_dup 1) (match_dup 2)))))
+   (set (match_operand:SI 0 "register_operand" "=r")
+	(plus:SI (match_dup 1) (match_dup 2)))]
+  "TARGET_32BIT"
+  "adds%?\\t%0, %1, %2"
+  [(set_attr "conds" "set")
+   (set_attr "type" "alus_sreg")]
+)
+
+(define_insn "*addsi3_compareV_upper"
+  [(set (reg:CC_V CC_REGNUM)
+	(ne:CC_V
+	  (plus:DI
+	   (plus:DI
+	    (sign_extend:DI (match_operand:SI 1 "register_operand" "r"))
+	    (sign_extend:DI (match_operand:SI 2 "register_operand" "r")))
+	   (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
+	  (plus:DI (sign_extend:DI
+		    (plus:SI (match_dup 1) (match_dup 2)))
+		   (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
+   (set (match_operand:SI 0 "register_operand" "=r")
+	(plus:SI
+	 (plus:SI (match_dup 1) (match_dup 2))
+	 (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
+  "TARGET_32BIT"
+  "adcs%?\\t%0, %1, %2"
+  [(set_attr "conds" "set")
+   (set_attr "type" "adcs_reg")]
+)
+
+(define_insn_and_split "adddi3_compareC"
+  [(set (reg:CC_C CC_REGNUM)
+	(ne:CC_C
+	  (plus:TI
+	    (zero_extend:TI (match_operand:DI 1 "register_operand" "r"))
+	    (zero_extend:TI (match_operand:DI 2 "register_operand" "r")))
+	  (zero_extend:TI (plus:DI (match_dup 1) (match_dup 2)))))
+   (set (match_operand:DI 0 "register_operand" "=&r")
+	(plus:DI (match_dup 1) (match_dup 2)))]
+  "TARGET_32BIT"
+  "#"
+  "TARGET_32BIT && reload_completed"
+  [(parallel [(set (reg:CC_C CC_REGNUM)
+		   (compare:CC_C (plus:SI (match_dup 1) (match_dup 2))
+				 (match_dup 1)))
+	      (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))])
+   (parallel [(set (reg:CC_C CC_REGNUM)
+		   (ne:CC_C
+		    (plus:DI (plus:DI
+			      (zero_extend:DI (match_dup 4))
+			      (zero_extend:DI (match_dup 5)))
+			     (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
+		    (plus:DI (zero_extend:DI
+			      (plus:SI (match_dup 4) (match_dup 5)))
+			     (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
+	     (set (match_dup 3) (plus:SI
+				 (plus:SI (match_dup 4) (match_dup 5))
+				 (ltu:SI (reg:CC_C CC_REGNUM)
+					 (const_int 0))))])]
+  "
+  {
+    operands[3] = gen_highpart (SImode, operands[0]);
+    operands[0] = gen_lowpart (SImode, operands[0]);
+    operands[4] = gen_highpart (SImode, operands[1]);
+    operands[5] = gen_highpart (SImode, operands[2]);
+    operands[1] = gen_lowpart (SImode, operands[1]);
+    operands[2] = gen_lowpart (SImode, operands[2]);
+  }"
+ [(set_attr "conds" "set")
+   (set_attr "length" "8")
+   (set_attr "type" "multiple")]
+)
+
+(define_insn "*addsi3_compareC_upper"
+  [(set (reg:CC_C CC_REGNUM)
+	(ne:CC_C
+	  (plus:DI
+	   (plus:DI
+	    (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
+	    (zero_extend:DI (match_operand:SI 2 "register_operand" "r")))
+	   (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
+	  (plus:DI (zero_extend:DI
+		    (plus:SI (match_dup 1) (match_dup 2)))
+		   (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
+   (set (match_operand:SI 0 "register_operand" "=r")
+	(plus:SI
+	 (plus:SI (match_dup 1) (match_dup 2))
+	 (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
+  "TARGET_32BIT"
+  "adcs%?\\t%0, %1, %2"
+  [(set_attr "conds" "set")
+   (set_attr "type" "adcs_reg")]
+)
+
+(define_insn "addsi3_compareC"
+   [(set (reg:CC_C CC_REGNUM)
+	 (ne:CC_C
+	  (plus:DI
+	   (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
+	   (zero_extend:DI (match_operand:SI 2 "register_operand" "r")))
+	  (zero_extend:DI
+	   (plus:SI (match_dup 1) (match_dup 2)))))
+    (set (match_operand:SI 0 "register_operand" "=r")
+	 (plus:SI (match_dup 1) (match_dup 2)))]
+  "TARGET_32BIT"
+  "adds%?\\t%0, %1, %2"
+  [(set_attr "conds" "set")
+   (set_attr "type" "alus_sreg")]
+)
+
 (define_insn "addsi3_compare0"
   [(set (reg:CC_NOOV CC_REGNUM)
 	(compare:CC_NOOV
@@ -865,6 +1060,85 @@ 
     (set_attr "type" "adcs_reg")]
 )
 
+(define_expand "subv<mode>4"
+  [(match_operand:SIDI 0 "register_operand")
+   (match_operand:SIDI 1 "register_operand")
+   (match_operand:SIDI 2 "register_operand")
+   (match_operand 3 "")]
+  "TARGET_32BIT"
+{
+  emit_insn (gen_sub<mode>3_compare1 (operands[0], operands[1], operands[2]));
+
+  rtx x;
+  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Vmode, CC_REGNUM), const0_rtx);
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+			    gen_rtx_LABEL_REF (VOIDmode, operands[3]),
+			    pc_rtx);
+  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
+  DONE;
+})
+
+(define_expand "usubv<mode>4"
+  [(match_operand:SIDI 0 "register_operand")
+   (match_operand:SIDI 1 "register_operand")
+   (match_operand:SIDI 2 "register_operand")
+   (match_operand 3 "")]
+  "TARGET_32BIT"
+{
+  emit_insn (gen_sub<mode>3_compare1 (operands[0], operands[1], operands[2]));
+
+  rtx x;
+  x = gen_rtx_LTU (VOIDmode, gen_rtx_REG (CCmode, CC_REGNUM), const0_rtx);
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+			    gen_rtx_LABEL_REF (VOIDmode, operands[3]),
+			    pc_rtx);
+  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
+  DONE;
+})
+
+(define_insn_and_split "subdi3_compare1"
+  [(set (reg:CC CC_REGNUM)
+	(compare:CC
+	  (match_operand:DI 1 "register_operand" "r")
+	  (match_operand:DI 2 "register_operand" "r")))
+   (set (match_operand:DI 0 "register_operand" "=&r")
+	(minus:DI (match_dup 1) (match_dup 2)))]
+  "TARGET_32BIT"
+  "#"
+  "TARGET_32BIT && reload_completed"
+  [(parallel [(set (reg:CC CC_REGNUM)
+		   (compare:CC (match_dup 1) (match_dup 2)))
+	      (set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))])
+   (parallel [(set (reg:CC CC_REGNUM)
+		   (compare:CC (match_dup 4) (match_dup 5)))
+	     (set (match_dup 3) (minus:SI (minus:SI (match_dup 4) (match_dup 5))
+			       (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])]
+  {
+    operands[3] = gen_highpart (SImode, operands[0]);
+    operands[0] = gen_lowpart (SImode, operands[0]);
+    operands[4] = gen_highpart (SImode, operands[1]);
+    operands[1] = gen_lowpart (SImode, operands[1]);
+    operands[5] = gen_highpart (SImode, operands[2]);
+    operands[2] = gen_lowpart (SImode, operands[2]);
+   }
+  [(set_attr "conds" "set")
+   (set_attr "length" "8")
+   (set_attr "type" "multiple")]
+)
+
+(define_insn "subsi3_compare1"
+  [(set (reg:CC CC_REGNUM)
+	(compare:CC
+	  (match_operand:SI 1 "register_operand" "r")
+	  (match_operand:SI 2 "register_operand" "r")))
+   (set (match_operand:SI 0 "register_operand" "=r")
+	(minus:SI (match_dup 1) (match_dup 2)))]
+  "TARGET_32BIT"
+  "subs%?\\t%0, %1, %2"
+  [(set_attr "conds" "set")
+   (set_attr "type" "alus_sreg")]
+)
+
 (define_insn "*subsi3_carryin"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
         (minus:SI (minus:SI (match_operand:SI 1 "reg_or_int_operand" "r,I")
@@ -4349,6 +4623,74 @@ 
 
 ;; Unary arithmetic insns
 
+(define_expand "negvsi3"
+  [(match_operand:SI 0 "register_operand")
+   (match_operand:SI 1 "register_operand")
+   (match_operand 2 "")]
+  "TARGET_32BIT"
+{
+  emit_insn (gen_subsi3_compare (operands[0], const0_rtx, operands[1]));
+
+  rtx x;
+  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Vmode, CC_REGNUM), const0_rtx);
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+			    gen_rtx_LABEL_REF (VOIDmode, operands[2]),
+			    pc_rtx);
+  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
+  DONE;
+})
+
+(define_expand "negvdi3"
+  [(match_operand:DI 0 "register_operand")
+   (match_operand:DI 1 "register_operand")
+   (match_operand 2 "")]
+  "TARGET_ARM"
+{
+
+  emit_insn (gen_negdi2_compare (operands[0], operands[1]));
+
+  rtx x;
+  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Vmode, CC_REGNUM), const0_rtx);
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+			    gen_rtx_LABEL_REF (VOIDmode, operands[2]),
+			    pc_rtx);
+  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
+  DONE;
+})
+
+
+(define_insn_and_split "negdi2_compare"
+  [(set (reg:CC CC_REGNUM)
+	(compare:CC
+	  (const_int 0)
+	  (match_operand:DI 1 "register_operand" "0,r")))
+   (set (match_operand:DI 0 "register_operand" "=r,&r")
+	(minus:DI (const_int 0) (match_dup 1)))]
+  "TARGET_ARM"
+  "#"
+  "TARGET_ARM && reload_completed"
+  [(parallel [(set (reg:CC CC_REGNUM)
+		   (compare:CC (const_int 0) (match_dup 1)))
+	      (set (match_dup 0) (minus:SI (const_int 0)
+					   (match_dup 1)))])
+   (parallel [(set (reg:CC CC_REGNUM)
+		   (compare:CC (const_int 0) (match_dup 3)))
+	     (set (match_dup 2)
+		  (minus:SI
+		   (minus:SI (const_int 0) (match_dup 3))
+		   (ltu:SI (reg:CC_C CC_REGNUM)
+			   (const_int 0))))])]
+  {
+    operands[2] = gen_highpart (SImode, operands[0]);
+    operands[0] = gen_lowpart (SImode, operands[0]);
+    operands[3] = gen_highpart (SImode, operands[1]);
+    operands[1] = gen_lowpart (SImode, operands[1]);
+  }
+  [(set_attr "conds" "set")
+   (set_attr "length" "8")
+   (set_attr "type" "multiple")]
+)
+
 (define_expand "negdi2"
  [(parallel
    [(set (match_operand:DI 0 "s_register_operand" "")
@@ -4389,6 +4731,20 @@ 
    (set_attr "type" "multiple")]
 )
 
+(define_insn "*negsi2_carryin_compare"
+  [(set (reg:CC CC_REGNUM)
+	(compare:CC (const_int 0)
+		    (match_operand:SI 1 "s_register_operand" "r")))
+   (set (match_operand:SI 0 "s_register_operand" "=r")
+	(minus:SI (minus:SI (const_int 0)
+			    (match_dup 1))
+		  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
+  "TARGET_ARM"
+  "rscs\\t%0, %1, #0"
+  [(set_attr "conds" "set")
+   (set_attr "type" "alus_imm")]
+)
+
 (define_expand "negsi2"
   [(set (match_operand:SI         0 "s_register_operand" "")
 	(neg:SI (match_operand:SI 1 "s_register_operand" "")))]
diff --git a/gcc/testsuite/gcc.target/arm/builtin_saddl.c b/gcc/testsuite/gcc.target/arm/builtin_saddl.c
new file mode 100644
index 0000000..e9dab3c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/builtin_saddl.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" }  */
+/* { dg-require-effective-target arm32 } */
+extern void overflow_handler ();
+
+long overflow_add (long x, long y)
+{
+  long r;
+
+  int ovr = __builtin_saddl_overflow (x, y, &r);
+  if (ovr)
+    overflow_handler ();
+
+  return r;
+}
+
+/* {dg-final { scan-assembler "adds" }} */
diff --git a/gcc/testsuite/gcc.target/arm/builtin_saddll.c b/gcc/testsuite/gcc.target/arm/builtin_saddll.c
new file mode 100644
index 0000000..c4db30e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/builtin_saddll.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" }  */
+/* { dg-require-effective-target arm32 } */
+extern void overflow_handler ();
+
+long long overflow_add (long long x, long long y)
+{
+  long long r;
+
+  int ovr = __builtin_saddll_overflow (x, y, &r);
+  if (ovr)
+    overflow_handler ();
+
+  return r;
+}
+
+/* {dg-final { scan-assembler "adds" }} */
+/* {dg-final { scan-assembler "adcs" }} */
diff --git a/gcc/testsuite/gcc.target/arm/builtin_ssubl.c b/gcc/testsuite/gcc.target/arm/builtin_ssubl.c
new file mode 100644
index 0000000..766cddb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/builtin_ssubl.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" }  */
+/* { dg-require-effective-target arm32 } */
+extern void overflow_handler ();
+
+long overflow_sub (long x, long y)
+{
+  long r;
+
+  int ovr = __builtin_ssubl_overflow (x, y, &r);
+  if (ovr)
+    overflow_handler ();
+
+  return r;
+}
+
+/* {dg-final { scan-assembler "subs" }} */
diff --git a/gcc/testsuite/gcc.target/arm/builtin_ssubll.c b/gcc/testsuite/gcc.target/arm/builtin_ssubll.c
new file mode 100644
index 0000000..085e92a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/builtin_ssubll.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" }  */
+/* { dg-require-effective-target arm32 } */
+extern void overflow_handler ();
+
+long long overflow_sub (long long x, long long y)
+{
+  long long r;
+
+  int ovr = __builtin_ssubll_overflow (x, y, &r);
+  if (ovr)
+    overflow_handler ();
+
+  return r;
+}
+
+/* {dg-final { scan-assembler "subs" }} */
+/* {dg-final { scan-assembler "sbcs" }} */
diff --git a/gcc/testsuite/gcc.target/arm/builtin_uaddl.c b/gcc/testsuite/gcc.target/arm/builtin_uaddl.c
new file mode 100644
index 0000000..1ea59f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/builtin_uaddl.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" }  */
+/* { dg-require-effective-target arm32 } */
+extern void overflow_handler ();
+
+unsigned long overflow_add (unsigned long x, unsigned long y)
+{
+  unsigned long r;
+
+  int ovr = __builtin_uaddl_overflow (x, y, &r);
+  if (ovr)
+    overflow_handler ();
+
+  return r;
+}
+
+/* {dg-final { scan-assembler "adds" }} */
diff --git a/gcc/testsuite/gcc.target/arm/builtin_uaddll.c b/gcc/testsuite/gcc.target/arm/builtin_uaddll.c
new file mode 100644
index 0000000..acf21a0c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/builtin_uaddll.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" }  */
+/* { dg-require-effective-target arm32 } */
+extern void overflow_handler ();
+
+unsigned long long overflow_add (unsigned long long x, unsigned long long y)
+{
+  unsigned long long r;
+
+  int ovr = __builtin_uaddll_overflow (x, y, &r);
+  if (ovr)
+    overflow_handler ();
+
+  return r;
+}
+
+/* {dg-final { scan-assembler "adds" }} */
+/* {dg-final { scan-assembler "adcs" }} */
diff --git a/gcc/testsuite/gcc.target/arm/builtin_usubl.c b/gcc/testsuite/gcc.target/arm/builtin_usubl.c
new file mode 100644
index 0000000..9c508cc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/builtin_usubl.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" }  */
+/* { dg-require-effective-target arm32 } */
+extern void overflow_handler ();
+
+unsigned long overflow_sub (unsigned long x, unsigned long y)
+{
+  unsigned long r;
+
+  int ovr = __builtin_usubl_overflow (x, y, &r);
+  if (ovr)
+    overflow_handler ();
+
+  return r;
+}
+
+/* {dg-final { scan-assembler "subs" }} */
diff --git a/gcc/testsuite/gcc.target/arm/builtin_usubll.c b/gcc/testsuite/gcc.target/arm/builtin_usubll.c
new file mode 100644
index 0000000..8f513f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/builtin_usubll.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" }  */
+/* { dg-require-effective-target arm32 } */
+extern void overflow_handler ();
+
+unsigned long long overflow_sub (unsigned long long x, unsigned long long y)
+{
+  unsigned long long r;
+
+  int ovr = __builtin_usubll_overflow (x, y, &r);
+  if (ovr)
+    overflow_handler ();
+
+  return r;
+}
+
+/* {dg-final { scan-assembler "subs" }} */
+/* {dg-final { scan-assembler "sbcs" }} */
-- 
1.9.1