diff mbox series

ARM cmpsi2_addneg fix follow-up (PR target/89506)

Message ID 20190304090401.GG7611@tucnak
State New
Headers show
Series ARM cmpsi2_addneg fix follow-up (PR target/89506) | expand

Commit Message

Jakub Jelinek March 4, 2019, 9:04 a.m. UTC
On Fri, Mar 01, 2019 at 03:41:33PM +0000, Wilco Dijkstra wrote:
> > and regtest revealed two code size
> > regressions because of that.  Is -1 vs. 1 the only case of immediate
> > valid for both "I" and "L" constraints where the former is longer than the
> > latter?
> 
> Yes -1 is the only case which can result in larger code on Thumb-2, so -1 should
> be disallowed by the I constraint (or even better, the underlying query). That way
> it will work correctly for all add/sub patterns, not just this one.

So, over the weekend I've bootstrapped/regtested on armv7hl-linux-gnueabi
following two possible follow-ups which handle the -1 and 1 cases right
(prefer the instruction with #1 for thumb2), 0 and INT_MIN (use subs) and
for others use subs if both constraints match, otherwise adds.

The first one uses constraints and no C code in the output, I believe it is
actually more expensive for compile time, because if one just reads what
constrain_operands needs to do for another constraint, it is quite a lot.
I've tried to at least not introduce new constraints for this, there is no
constraint for number 1 (or for number -1).
The Pu constraint is thumb2 only for numbers 1..8, and the alternative uses
I constraint for the negation of it, i.e. -8..-1, only -1 from this is
valid for I.  If that matches, we emit adds with #1, otherwise just prefer
subs over adds.

The other swaps the alternatives similarly to the above, but for the special
case of desirable adds with #1 uses C code instead of another alternative.

Ok for trunk (which one)?

	Jakub
2019-03-04  Jakub Jelinek  <jakub@redhat.com>

	PR target/89506
	* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives, add
	another alternative with I constraint for operands[2] and Pu
	for operands[3] and emit adds in that case, don't use C code to
	emit the instruction.
2019-03-04  Jakub Jelinek  <jakub@redhat.com>

	PR target/89506
	* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives and use
	subs for the first alternative except when operands[3] is 1.

--- gcc/config/arm/arm.md.jj	2019-03-02 09:04:25.550794239 +0100
+++ gcc/config/arm/arm.md	2019-03-02 09:41:03.501404694 +0100
@@ -857,27 +857,27 @@ (define_insn "*compare_negsi_si"
    (set_attr "type" "alus_sreg")]
 )
 
-;; This is the canonicalization of addsi3_compare0_for_combiner when the
+;; This is the canonicalization of subsi3_compare when the
 ;; addend is a constant.
 (define_insn "cmpsi2_addneg"
   [(set (reg:CC CC_REGNUM)
 	(compare:CC
 	 (match_operand:SI 1 "s_register_operand" "r,r")
-	 (match_operand:SI 2 "arm_addimm_operand" "L,I")))
+	 (match_operand:SI 2 "arm_addimm_operand" "I,L")))
    (set (match_operand:SI 0 "s_register_operand" "=r,r")
 	(plus:SI (match_dup 1)
-		 (match_operand:SI 3 "arm_addimm_operand" "I,L")))]
+		 (match_operand:SI 3 "arm_addimm_operand" "L,I")))]
   "TARGET_32BIT
    && (INTVAL (operands[2])
        == trunc_int_for_mode (-INTVAL (operands[3]), SImode))"
 {
-  /* For 0 and INT_MIN it is essential that we use subs, as adds
-     will result in different condition codes (like cmn rather than
-     like cmp).  For other immediates, we should choose whatever
-     will have smaller encoding.  */
-  if (operands[2] == const0_rtx
-      || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
-      || which_alternative == 1)
+  /* For 0 and INT_MIN it is essential that we use subs, as adds will result
+     in different condition codes (like cmn rather than like cmp), so that
+     alternative comes first.  Both alternatives can match for any 0x??000000
+     where except for 0 and INT_MIN it doesn't matter what we choose, and also
+     for -1 and 1 with TARGET_THUMB2, in that case prefer instruction with #1
+     as it is shorter.  */
+  if (which_alternative == 0 && operands[3] != const1_rtx)
     return "subs%?\\t%0, %1, #%n3";
   else
     return "adds%?\\t%0, %1, %3";

Comments

Jakub Jelinek March 12, 2019, 11:43 a.m. UTC | #1
On Mon, Mar 04, 2019 at 10:04:01AM +0100, Jakub Jelinek wrote:
> The first one uses constraints and no C code in the output, I believe it is
> actually more expensive for compile time, because if one just reads what
> constrain_operands needs to do for another constraint, it is quite a lot.
> I've tried to at least not introduce new constraints for this, there is no
> constraint for number 1 (or for number -1).
> The Pu constraint is thumb2 only for numbers 1..8, and the alternative uses
> I constraint for the negation of it, i.e. -8..-1, only -1 from this is
> valid for I.  If that matches, we emit adds with #1, otherwise just prefer
> subs over adds.
> 
> The other swaps the alternatives similarly to the above, but for the special
> case of desirable adds with #1 uses C code instead of another alternative.
> 
> Ok for trunk (which one)?

I'd like to ping these patches:

> 2019-03-04  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/89506
> 	* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives, add
> 	another alternative with I constraint for operands[2] and Pu
> 	for operands[3] and emit adds in that case, don't use C code to
> 	emit the instruction.

or:

> 2019-03-04  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/89506
> 	* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives and use
> 	subs for the first alternative except when operands[3] is 1.

	Jakub
Jakub Jelinek March 19, 2019, 7:27 a.m. UTC | #2
On Tue, Mar 12, 2019 at 12:43:29PM +0100, Jakub Jelinek wrote:
> On Mon, Mar 04, 2019 at 10:04:01AM +0100, Jakub Jelinek wrote:
> > The first one uses constraints and no C code in the output, I believe it is
> > actually more expensive for compile time, because if one just reads what
> > constrain_operands needs to do for another constraint, it is quite a lot.
> > I've tried to at least not introduce new constraints for this, there is no
> > constraint for number 1 (or for number -1).
> > The Pu constraint is thumb2 only for numbers 1..8, and the alternative uses
> > I constraint for the negation of it, i.e. -8..-1, only -1 from this is
> > valid for I.  If that matches, we emit adds with #1, otherwise just prefer
> > subs over adds.
> > 
> > The other swaps the alternatives similarly to the above, but for the special
> > case of desirable adds with #1 uses C code instead of another alternative.
> > 
> > Ok for trunk (which one)?
> 
> I'd like to ping these patches:
> 
> > 2019-03-04  Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	PR target/89506
> > 	* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives, add
> > 	another alternative with I constraint for operands[2] and Pu
> > 	for operands[3] and emit adds in that case, don't use C code to
> > 	emit the instruction.
> 
> or:
> 
> > 2019-03-04  Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	PR target/89506
> > 	* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives and use
> > 	subs for the first alternative except when operands[3] is 1.

Ping.

	Jakub
Ramana Radhakrishnan March 19, 2019, 10 a.m. UTC | #3
On 04/03/2019 09:04, Jakub Jelinek wrote:
> On Fri, Mar 01, 2019 at 03:41:33PM +0000, Wilco Dijkstra wrote:
>> > and regtest revealed two code size
>> > regressions because of that.  Is -1 vs. 1 the only case of immediate
>> > valid for both "I" and "L" constraints where the former is longer than the
>> > latter?
>> 
>> Yes -1 is the only case which can result in larger code on Thumb-2, so -1 should
>> be disallowed by the I constraint (or even better, the underlying query). That way
>> it will work correctly for all add/sub patterns, not just this one.
> 
> So, over the weekend I've bootstrapped/regtested on armv7hl-linux-gnueabi
> following two possible follow-ups which handle the -1 and 1 cases right
> (prefer the instruction with #1 for thumb2), 0 and INT_MIN (use subs) and
> for others use subs if both constraints match, otherwise adds.
> 
> The first one uses constraints and no C code in the output, I believe it is
> actually more expensive for compile time, because if one just reads what
> constrain_operands needs to do for another constraint, it is quite a lot.
> I've tried to at least not introduce new constraints for this, there is no
> constraint for number 1 (or for number -1).
> The Pu constraint is thumb2 only for numbers 1..8, and the alternative uses
> I constraint for the negation of it, i.e. -8..-1, only -1 from this is
> valid for I.  If that matches, we emit adds with #1, otherwise just prefer
> subs over adds.
> 
> The other swaps the alternatives similarly to the above, but for the special
> case of desirable adds with #1 uses C code instead of another alternative.
> 
> Ok for trunk (which one)?
> 
>          Jakub


Option #2 is better (the C code variant)- for something like this adding 
one more constraint is a bit painful.

Ok and watch out for any regressions as usual.

Ramana
diff mbox series

Patch

--- gcc/config/arm/arm.md.jj	2019-03-02 09:04:25.550794239 +0100
+++ gcc/config/arm/arm.md	2019-03-02 17:08:13.036725812 +0100
@@ -857,31 +857,31 @@  (define_insn "*compare_negsi_si"
    (set_attr "type" "alus_sreg")]
 )
 
-;; This is the canonicalization of addsi3_compare0_for_combiner when the
+;; This is the canonicalization of subsi3_compare when the
 ;; addend is a constant.
+;; For 0 and INT_MIN it is essential that we use subs, as adds will result
+;; in different condition codes (like cmn rather than like cmp), so that
+;; alternative comes first.  Both I and L constraints can match for any
+;; 0x??000000 where except for 0 and INT_MIN it doesn't matter what we choose,
+;; and also for -1 and 1 with TARGET_THUMB2, in that case prefer instruction
+;; with #1 as it is shorter.  The first alternative will use adds ?, ?, #1 over
+;; subs ?, ?, #-1, the second alternative will use subs for #0 or #2147483648
+;; or any other case where both I and L constraints match.
 (define_insn "cmpsi2_addneg"
   [(set (reg:CC CC_REGNUM)
 	(compare:CC
-	 (match_operand:SI 1 "s_register_operand" "r,r")
-	 (match_operand:SI 2 "arm_addimm_operand" "L,I")))
-   (set (match_operand:SI 0 "s_register_operand" "=r,r")
+	 (match_operand:SI 1 "s_register_operand" "r,r,r")
+	 (match_operand:SI 2 "arm_addimm_operand" "I,I,L")))
+   (set (match_operand:SI 0 "s_register_operand" "=r,r,r")
 	(plus:SI (match_dup 1)
-		 (match_operand:SI 3 "arm_addimm_operand" "I,L")))]
+		 (match_operand:SI 3 "arm_addimm_operand" "Pu,L,I")))]
   "TARGET_32BIT
    && (INTVAL (operands[2])
        == trunc_int_for_mode (-INTVAL (operands[3]), SImode))"
-{
-  /* For 0 and INT_MIN it is essential that we use subs, as adds
-     will result in different condition codes (like cmn rather than
-     like cmp).  For other immediates, we should choose whatever
-     will have smaller encoding.  */
-  if (operands[2] == const0_rtx
-      || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
-      || which_alternative == 1)
-    return "subs%?\\t%0, %1, #%n3";
-  else
-    return "adds%?\\t%0, %1, %3";
-}
+  "@
+   adds%?\\t%0, %1, %3
+   subs%?\\t%0, %1, #%n3
+   adds%?\\t%0, %1, %3"
   [(set_attr "conds" "set")
    (set_attr "type" "alus_sreg")]
 )