diff mbox series

[v2,3/9] aarch64: Add <su>cmp_*_carryinC patterns

Message ID 20200321024231.13778-4-richard.henderson@linaro.org
State New
Headers show
Series aarch64: Implement TImode comparisons | expand

Commit Message

Li, Pan2 via Gcc-patches March 21, 2020, 2:42 a.m. UTC
Duplicate all usub_*_carryinC, but use xzr for the output when we
only require the flags output.  The signed versions use sign_extend
instead of zero_extend for combine's benefit.

These will be used shortly for TImode comparisons.

	* config/aarch64/aarch64.md (<su>cmp<GPI>3_carryinC): New.
	(*<su>cmp<GPI>3_carryinC_z1): New.
	(*<su>cmp<GPI>3_carryinC_z2): New.
	(*<su>cmp<GPI>3_carryinC): New.
---
 gcc/config/aarch64/aarch64.md | 50 +++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

Comments

Segher Boessenkool March 22, 2020, 7:30 p.m. UTC | #1
Hi!

On Fri, Mar 20, 2020 at 07:42:25PM -0700, Richard Henderson via Gcc-patches wrote:
> Duplicate all usub_*_carryinC, but use xzr for the output when we
> only require the flags output.  The signed versions use sign_extend
> instead of zero_extend for combine's benefit.

You actually use ANY_EXTEND, which makes a lot more sense :-)

Did you see combine create a sign_extend, ever?  Or do those just come
from combining other insns that already contain a sign_extend?


Segher
Li, Pan2 via Gcc-patches March 22, 2020, 8:40 p.m. UTC | #2
On 3/22/20 12:30 PM, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Mar 20, 2020 at 07:42:25PM -0700, Richard Henderson via Gcc-patches wrote:
>> Duplicate all usub_*_carryinC, but use xzr for the output when we
>> only require the flags output.  The signed versions use sign_extend
>> instead of zero_extend for combine's benefit.
> 
> You actually use ANY_EXTEND, which makes a lot more sense :-)
> 
> Did you see combine create a sign_extend, ever?  Or do those just come
> from combining other insns that already contain a sign_extend?

In the penultimate patch, for cmpti, I emit this sign_extend'ed pattern
manually, so that rtl actually gets the proper description of the comparison of
the high-half of the TImode variable.


r~
Richard Sandiford March 31, 2020, 6:34 p.m. UTC | #3
Richard Henderson <richard.henderson@linaro.org> writes:
> Duplicate all usub_*_carryinC, but use xzr for the output when we
> only require the flags output.  The signed versions use sign_extend
> instead of zero_extend for combine's benefit.
>
> These will be used shortly for TImode comparisons.
>
> 	* config/aarch64/aarch64.md (<su>cmp<GPI>3_carryinC): New.
> 	(*<su>cmp<GPI>3_carryinC_z1): New.
> 	(*<su>cmp<GPI>3_carryinC_z2): New.
> 	(*<su>cmp<GPI>3_carryinC): New.
> ---
>  gcc/config/aarch64/aarch64.md | 50 +++++++++++++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
>
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index a996a5f1c39..9b1c3f797f9 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -3440,6 +3440,18 @@
>     ""
>  )
>  
> +(define_expand "<ANY_EXTEND:su>cmp<GPI:mode>3_carryinC"
> +   [(set (reg:CC CC_REGNUM)
> +	 (compare:CC
> +	   (ANY_EXTEND:<DWI>
> +	     (match_operand:GPI 0 "register_operand"))
> +	   (plus:<DWI>
> +	     (ANY_EXTEND:<DWI>
> +	       (match_operand:GPI 1 "register_operand"))
> +	     (ltu:<DWI> (reg:CC CC_REGNUM) (const_int 0)))))]
> +   ""
> +)
>  (define_insn "*usub<GPI:mode>3_carryinC_z1"
>    [(set (reg:CC CC_REGNUM)
>  	(compare:CC
> @@ -3457,6 +3469,19 @@
>    [(set_attr "type" "adc_reg")]
>  )
>  
> +(define_insn "*<ANY_EXTEND:su>cmp<GPI:mode>3_carryinC_z1"
> +  [(set (reg:CC CC_REGNUM)
> +	(compare:CC
> +	  (const_int 0)
> +	  (plus:<DWI>
> +	    (ANY_EXTEND:<DWI>
> +	      (match_operand:GPI 0 "register_operand" "r"))
> +	    (match_operand:<DWI> 1 "aarch64_borrow_operation" ""))))]
> +   ""
> +   "sbcs\\t<w>zr, <w>zr, %<w>0"
> +  [(set_attr "type" "adc_reg")]
> +)
> +
>  (define_insn "*usub<GPI:mode>3_carryinC_z2"
>    [(set (reg:CC CC_REGNUM)
>  	(compare:CC
> @@ -3472,6 +3497,17 @@
>    [(set_attr "type" "adc_reg")]
>  )
>  
> +(define_insn "*<ANY_EXTEND:su>cmp<GPI:mode>3_carryinC_z2"
> +  [(set (reg:CC CC_REGNUM)
> +	(compare:CC
> +	  (ANY_EXTEND:<DWI>
> +	    (match_operand:GPI 0 "register_operand" "r"))
> +	  (match_operand:<DWI> 1 "aarch64_borrow_operation" "")))]
> +   ""
> +   "sbcs\\t<w>zr, %<w>0, <w>zr"
> +  [(set_attr "type" "adc_reg")]
> +)
> +
>  (define_insn "*usub<GPI:mode>3_carryinC"
>    [(set (reg:CC CC_REGNUM)
>  	(compare:CC
> @@ -3490,6 +3526,20 @@
>    [(set_attr "type" "adc_reg")]
>  )
>  
> +(define_insn "*<ANY_EXTEND:su>cmp<GPI:mode>3_carryinC"
> +  [(set (reg:CC CC_REGNUM)
> +	(compare:CC
> +	  (ANY_EXTEND:<DWI>
> +	    (match_operand:GPI 0 "register_operand" "r"))
> +	  (plus:<DWI>
> +	    (ANY_EXTEND:<DWI>
> +	      (match_operand:GPI 1 "register_operand" "r"))
> +	    (match_operand:<DWI> 2 "aarch64_borrow_operation" ""))))]
> +   ""
> +   "sbcs\\t<w>zr, %<w>0, %<w>1"
> +  [(set_attr "type" "adc_reg")]
> +)

I guess this feeds into your reply to Segher's comment for 7/9,
but I think:

   (compare:CC X Y)

is always supposed to be the NZCV flags result of X - Y, as computed in
the mode of X and Y.  If so, it seems like the type of extension should
matter.  E.g. the N flag ought to be set for:

  (compare:CC
    (sign_extend 0xf...)
    (plus (sign_extend 0x7...)
          (ltu ...)))

but ought to be clear for:

  (compare:CC
    (zero_extend 0xf...)
    (plus (zero_extend 0x7...)
          (ltu ...)))

If so, I guess this is a bug in the existing code...

Thanks,
Richard
Li, Pan2 via Gcc-patches March 31, 2020, 10:44 p.m. UTC | #4
On 3/31/20 11:34 AM, Richard Sandiford wrote:
>> +(define_insn "*<ANY_EXTEND:su>cmp<GPI:mode>3_carryinC"
>> +  [(set (reg:CC CC_REGNUM)
>> +	(compare:CC
>> +	  (ANY_EXTEND:<DWI>
>> +	    (match_operand:GPI 0 "register_operand" "r"))
>> +	  (plus:<DWI>
>> +	    (ANY_EXTEND:<DWI>
>> +	      (match_operand:GPI 1 "register_operand" "r"))
>> +	    (match_operand:<DWI> 2 "aarch64_borrow_operation" ""))))]
>> +   ""
>> +   "sbcs\\t<w>zr, %<w>0, %<w>1"
>> +  [(set_attr "type" "adc_reg")]
>> +)
> 
> I guess this feeds into your reply to Segher's comment for 7/9,
> but I think:
> 
>    (compare:CC X Y)
> 
> is always supposed to be the NZCV flags result of X - Y, as computed in
> the mode of X and Y.  If so, it seems like the type of extension should
> matter.  E.g. the N flag ought to be set for:
> 
>   (compare:CC
>     (sign_extend 0xf...)
>     (plus (sign_extend 0x7...)
>           (ltu ...)))
> 
> but ought to be clear for:
> 
>   (compare:CC
>     (zero_extend 0xf...)
>     (plus (zero_extend 0x7...)
>           (ltu ...)))
> 
> If so, I guess this is a bug in the existing code...

The subject of CCmodes is a sticky one.  It mostly all depends on what combine
is able to do with the patterns.

For instance, your choice of example above, even for signed, the N bit cannot
be examined by itself, because that would only be valid for a comparison
against zero, like

    (compare (plus (reg) (reg))
             (const_int 0))

For this particular bit of rtl, the only valid comparison is N == V, i.e. GE/LT.

If we add a new CC mode for this, what would you call it?  Probably not
CC_NVmode, because to me that implies you can use either N or V, but it doesn't
imply you must examine both.

If we add more CC modes, does that mean that we have to improve SELECT_CC_MODE
to match those patterns?  Or do we add new CC modes just so that combine's use
of SELECT_CC_MODE *cannot* match them?


r~
Segher Boessenkool April 1, 2020, 12:37 p.m. UTC | #5
On Tue, Mar 31, 2020 at 03:44:51PM -0700, Richard Henderson wrote:
> If we add more CC modes, does that mean that we have to improve SELECT_CC_MODE
> to match those patterns?  Or do we add new CC modes just so that combine's use
> of SELECT_CC_MODE *cannot* match them?

Adding new CC modes isn't helpful if nothing ever generates them.  To
have combine decide to use a different CC mode for a combined insn, you
need SELECT_CC_MODE.

The result of a combination is always valid (as RTL) with the original
CC mode, but such an insn might not exist, and perhaps it does exist
with some weaker CC mode (or more general CC mode, in some cases).


Segher
Richard Sandiford April 1, 2020, 4:28 p.m. UTC | #6
Richard Henderson <richard.henderson@linaro.org> writes:
> On 3/31/20 11:34 AM, Richard Sandiford wrote:
>>> +(define_insn "*<ANY_EXTEND:su>cmp<GPI:mode>3_carryinC"
>>> +  [(set (reg:CC CC_REGNUM)
>>> +	(compare:CC
>>> +	  (ANY_EXTEND:<DWI>
>>> +	    (match_operand:GPI 0 "register_operand" "r"))
>>> +	  (plus:<DWI>
>>> +	    (ANY_EXTEND:<DWI>
>>> +	      (match_operand:GPI 1 "register_operand" "r"))
>>> +	    (match_operand:<DWI> 2 "aarch64_borrow_operation" ""))))]
>>> +   ""
>>> +   "sbcs\\t<w>zr, %<w>0, %<w>1"
>>> +  [(set_attr "type" "adc_reg")]
>>> +)
>> 
>> I guess this feeds into your reply to Segher's comment for 7/9,
>> but I think:
>> 
>>    (compare:CC X Y)
>> 
>> is always supposed to be the NZCV flags result of X - Y, as computed in
>> the mode of X and Y.  If so, it seems like the type of extension should
>> matter.  E.g. the N flag ought to be set for:
>> 
>>   (compare:CC
>>     (sign_extend 0xf...)
>>     (plus (sign_extend 0x7...)
>>           (ltu ...)))
>> 
>> but ought to be clear for:
>> 
>>   (compare:CC
>>     (zero_extend 0xf...)
>>     (plus (zero_extend 0x7...)
>>           (ltu ...)))
>> 
>> If so, I guess this is a bug in the existing code...
>
> The subject of CCmodes is a sticky one.  It mostly all depends on what combine
> is able to do with the patterns.
>
> For instance, your choice of example above, even for signed, the N bit cannot
> be examined by itself, because that would only be valid for a comparison
> against zero, like
>
>     (compare (plus (reg) (reg))
>              (const_int 0))
>
> For this particular bit of rtl, the only valid comparison is N == V,
> i.e. GE/LT.
>
> If we add a new CC mode for this, what would you call it?  Probably not
> CC_NVmode, because to me that implies you can use either N or V, but it doesn't
> imply you must examine both.
>
> If we add more CC modes, does that mean that we have to improve SELECT_CC_MODE
> to match those patterns?  Or do we add new CC modes just so that combine's use
> of SELECT_CC_MODE *cannot* match them?

Yeah, looks awkward.  There isn't AFAICT any way to describe the full
NZCV result of SBCS as a compare, in the case where C is possibly zero.
So I guess if we're going to continue using compare then we need to cook
up a new mode like you say.

How important is it to describe the flags operation as a compare though?
Could we instead use an unspec with three inputs, and keep it as :CC?
That would still allow special-case matching for zero operands.

Thanks,
Richard
Li, Pan2 via Gcc-patches April 1, 2020, 5:14 p.m. UTC | #7
On 4/1/20 9:28 AM, Richard Sandiford wrote:
> How important is it to describe the flags operation as a compare though?
> Could we instead use an unspec with three inputs, and keep it as :CC?
> That would still allow special-case matching for zero operands.

I'm not sure.

My guess is that the only interesting optimization for ADC/SBC is when
optimization determines that the low-part of op2 is zero, so that we can fold

  [(set (reg cc) (compare ...))
   (set (reg t0) (sub (reg a0) (reg b0))]

  [(set (reg cc) (compare ...))
   (set (reg t1) (sub (reg a1)
		   (sub (reg b1)
		     (geu (reg cc) (const 0)))))]

to

  [(set (reg t0) (reg a0)]

  [(set (reg cc) (compare ...))
   (set (reg t1) (sub (reg a1) (reg b1))]

which combine should be able to do by propagating zeros across the compare+geu.

Though I suppose it's still possible to handle this with unspecs and
define_split, so that

  [(set (reg cc)
        (unspec [(reg a1) (reg b2) (geu ...)]
        UNSPEC_SBCS)
   (set (reg t1) ...)]

when the geu folds to (const_int 0), we can split this to a plain sub.

I'll see if I can make this work with a minimum of effort.


r~
diff mbox series

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index a996a5f1c39..9b1c3f797f9 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3440,6 +3440,18 @@ 
    ""
 )
 
+(define_expand "<ANY_EXTEND:su>cmp<GPI:mode>3_carryinC"
+   [(set (reg:CC CC_REGNUM)
+	 (compare:CC
+	   (ANY_EXTEND:<DWI>
+	     (match_operand:GPI 0 "register_operand"))
+	   (plus:<DWI>
+	     (ANY_EXTEND:<DWI>
+	       (match_operand:GPI 1 "register_operand"))
+	     (ltu:<DWI> (reg:CC CC_REGNUM) (const_int 0)))))]
+   ""
+)
+
 (define_insn "*usub<GPI:mode>3_carryinC_z1"
   [(set (reg:CC CC_REGNUM)
 	(compare:CC
@@ -3457,6 +3469,19 @@ 
   [(set_attr "type" "adc_reg")]
 )
 
+(define_insn "*<ANY_EXTEND:su>cmp<GPI:mode>3_carryinC_z1"
+  [(set (reg:CC CC_REGNUM)
+	(compare:CC
+	  (const_int 0)
+	  (plus:<DWI>
+	    (ANY_EXTEND:<DWI>
+	      (match_operand:GPI 0 "register_operand" "r"))
+	    (match_operand:<DWI> 1 "aarch64_borrow_operation" ""))))]
+   ""
+   "sbcs\\t<w>zr, <w>zr, %<w>0"
+  [(set_attr "type" "adc_reg")]
+)
+
 (define_insn "*usub<GPI:mode>3_carryinC_z2"
   [(set (reg:CC CC_REGNUM)
 	(compare:CC
@@ -3472,6 +3497,17 @@ 
   [(set_attr "type" "adc_reg")]
 )
 
+(define_insn "*<ANY_EXTEND:su>cmp<GPI:mode>3_carryinC_z2"
+  [(set (reg:CC CC_REGNUM)
+	(compare:CC
+	  (ANY_EXTEND:<DWI>
+	    (match_operand:GPI 0 "register_operand" "r"))
+	  (match_operand:<DWI> 1 "aarch64_borrow_operation" "")))]
+   ""
+   "sbcs\\t<w>zr, %<w>0, <w>zr"
+  [(set_attr "type" "adc_reg")]
+)
+
 (define_insn "*usub<GPI:mode>3_carryinC"
   [(set (reg:CC CC_REGNUM)
 	(compare:CC
@@ -3490,6 +3526,20 @@ 
   [(set_attr "type" "adc_reg")]
 )
 
+(define_insn "*<ANY_EXTEND:su>cmp<GPI:mode>3_carryinC"
+  [(set (reg:CC CC_REGNUM)
+	(compare:CC
+	  (ANY_EXTEND:<DWI>
+	    (match_operand:GPI 0 "register_operand" "r"))
+	  (plus:<DWI>
+	    (ANY_EXTEND:<DWI>
+	      (match_operand:GPI 1 "register_operand" "r"))
+	    (match_operand:<DWI> 2 "aarch64_borrow_operation" ""))))]
+   ""
+   "sbcs\\t<w>zr, %<w>0, %<w>1"
+  [(set_attr "type" "adc_reg")]
+)
+
 (define_expand "sub<GPI:mode>3_carryinV"
   [(parallel
      [(set (reg:CC_V CC_REGNUM)