diff mbox

[AArch64,1/3] Don't disparage add/sub in SIMD registers

Message ID 53EA2767.5060301@arm.com
State New
Headers show

Commit Message

Alan Lawrence Aug. 12, 2014, 2:40 p.m. UTC
(It is no more expensive.)

gcc/ChangeLog:

	* config/aarch64/aarch64.md (subdi3, adddi3_aarch64): Don't penalize
	SIMD reg variant.

Comments

Andrew Pinski Aug. 12, 2014, 3:53 p.m. UTC | #1
> On Aug 12, 2014, at 7:40 AM, Alan Lawrence <alan.lawrence@arm.com> wrote:
> 
> (It is no more expensive.)

Yes on some processors it could be. 

Thanks,
Andrew


> 
> gcc/ChangeLog:
> 
>    * config/aarch64/aarch64.md (subdi3, adddi3_aarch64): Don't penalize
>    SIMD reg variant.
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index f8eb305140e7b0aed006b33f1724a90939e48316..0a8ca4bcc7941f912c8d42200b33206d4188fa48 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1188,10 +1188,10 @@ (define_insn "*adddi3_aarch64" [(set - (match_operand:DI 0 "register_operand" "=rk,rk,rk,!w") + (match_operand:DI 0 "register_operand" "=rk,rk,rk,w") (plus:DI - (match_operand:DI 1 "register_operand" "%rk,rk,rk,!w") - (match_operand:DI 2 "aarch64_plus_operand" "I,r,J,!w")))] + (match_operand:DI 1 "register_operand" "%rk,rk,rk,w") + (match_operand:DI 2 "aarch64_plus_operand" "I,r,J,w")))] "" "@ add\\t%x0, %x1, %2 @@ -1662,9 +1662,9 @@ ) (define_insn "subdi3" - [(set (match_operand:DI 0 "register_operand" "=rk,!w") -	(minus:DI (match_operand:DI 1 "register_operand" "r,!w") -	 (match_operand:DI 2 "register_operand" "r,!w")))] + [(set (match_operand:DI 0 "register_operand" "=rk,w") +	(minus:DI (match_operand:DI 1 "register_operand" "r,w") +	 (match_operand:DI 2 "register_operand" "r,w")))] "" "@ sub\\t%x0, %x1, %x2
James Greenhalgh Aug. 13, 2014, 8:36 a.m. UTC | #2
On Tue, Aug 12, 2014 at 04:53:38PM +0100, pinskia@gmail.com wrote:
> 
> 
> > On Aug 12, 2014, at 7:40 AM, Alan Lawrence <alan.lawrence@arm.com> wrote:
> > 
> > (It is no more expensive.)
> 
> Yes on some processors it could be. 

Haven't we been here before [1]?

This disparaging mechanism is still not going to give what we are trying to
achieve (assigning cost at the -mcpu/-mtune level, rather than the
target level). 

Recall that '!' gives a static cost of 600 [2] and that this cost is
only applied to the alternative if any operand in that alternative needs
reloading - otherwise, LRA sees a set of matching operands and does not
bother checking costs [3]. IRA, as far as I can see, does not care about
'!', but unconditionally adds 2 to the cost of an alternative for '?' [4].

Even if LRA did try to do a more complete job of always picking the
alternative with lowest cost (rather than the current first-matching
behaviour) "600" would be far too high a cost for the operation.

If IRA took '!' into account, we would be as well to remove the alternative
entirely.

So, I still can't agree that we want these exclamation marks - and we are
now in a halfway state where some instructions have them and some don't.
We have to pick a consistent policy or we are going to see some very poor
code generation.

In an ideal world, we would have a sensible way of describing a
(per-core granularity) alternative cost, which would be considered by
the register allocators. I've played about with doing this using attributes,
but it ends up as a messy patch and I can't bring myself to add yet another
cost framework to the back end.

Do you have any ideas as to how we can make some progress? Maybe Vladimir
has some suggestions?

Thanks,
James

[1] https://gcc.gnu.org/ml/gcc-patches/2014-03/msg01627.html
[2] regoc.c::preprocess_constraints
[3] I may have misread this, but that seems to be what the final condition
    of the main loop of lra-constraints::process_alt_operands implies.
[4] ira-costs.c::record_reg_classes
Vladimir Makarov Aug. 13, 2014, 4:07 p.m. UTC | #3
On 2014-08-13, 4:36 AM, James Greenhalgh wrote:
> On Tue, Aug 12, 2014 at 04:53:38PM +0100, pinskia@gmail.com wrote:
>>
>>
>>> On Aug 12, 2014, at 7:40 AM, Alan Lawrence <alan.lawrence@arm.com> wrote:
>>>
>>> (It is no more expensive.)
>>
>> Yes on some processors it could be.
>
> Haven't we been here before [1]?
>
> This disparaging mechanism is still not going to give what we are trying to
> achieve (assigning cost at the -mcpu/-mtune level, rather than the
> target level).
>
> Recall that '!' gives a static cost of 600 [2] and that this cost is
> only applied to the alternative if any operand in that alternative needs
> reloading - otherwise, LRA sees a set of matching operands and does not
> bother checking costs [3]. IRA, as far as I can see, does not care about
> '!', but unconditionally adds 2 to the cost of an alternative for '?' [4].
>

Yes, correct.  IRA follows the old global RA and the documentation.

> Even if LRA did try to do a more complete job of always picking the
> alternative with lowest cost (rather than the current first-matching
> behaviour) "600" would be far too high a cost for the operation.
>
> If IRA took '!' into account, we would be as well to remove the alternative
> entirely.
>

Yes, that is right.

> So, I still can't agree that we want these exclamation marks - and we are
> now in a halfway state where some instructions have them and some don't.
> We have to pick a consistent policy or we are going to see some very poor
> code generation.
>
> In an ideal world, we would have a sensible way of describing a
> (per-core granularity) alternative cost, which would be considered by
> the register allocators. I've played about with doing this using attributes,
> but it ends up as a messy patch and I can't bring myself to add yet another
> cost framework to the back end.
>
> Do you have any ideas as to how we can make some progress? Maybe Vladimir
> has some suggestions?
>

Yes, I have some thoughts.  We could provide machine-dependent hooks to 
treat different costs for '!' (which will work only for LRA/reload), and 
'?', and even '*' (to ignore or not the next register constraint for 
register preference).  The hook could get insn for which we calculate 
the costs and of course to have the current default values) for 
compatibility.

It could provide a lot of flexibility for machine-description writers.

Although it would create some conflicts with insn cost attribute 
calculation by alternatives and would create more complication for less 
experienced md writer.   So we need some more consistent solution.  It 
is all about code selection which currently is done in many places 
(combiner, register preferences and finally in reload/LRA).  A lot of 
thinking should be done how to better approach to the solution.

The right approach would be reconsider all the pipeline and probably 
rewriting combiner/register preferencing (which is moved from the old 
regclass.c practically without changes and ignore the fact that the 
alternative should be the same for different operands) / partially LRA. 
  So it is complex task and a big project.  But I am going to work in 
this direction when I start to be less busy.  I am only afraid that the 
quick solutions as mentioned by me above could create a lot of 
complications for the long-term project.

> Thanks,
> James
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2014-03/msg01627.html
> [2] regoc.c::preprocess_constraints
> [3] I may have misread this, but that seems to be what the final condition
>      of the main loop of lra-constraints::process_alt_operands implies.
> [4] ira-costs.c::record_reg_classes
>
Alan Lawrence Aug. 18, 2014, 4:50 p.m. UTC | #4
Well, you're right that it could be. So I presented the wrong justification.

Clearly we would benefit from some better cost infrastructure here, ideally that 
is expressive, taken into account at all appropriate stages of the compiler, and 
tunable per core. I imagine that steps (patches) towards such infrastructure 
would be welcomed by both AArch64 maintainers and more widely.

In the meantime, however, we must work with what we have. I'll still argue that 
we should remove the '!' (as per patch), however. As James has said, even if 
your add is more expensive in SIMD registers, the '!' still doesn't express 
that; and leaving it in affects code-generation on all cores. And it is 
inconsistent with other instructions.

--Alan

pinskia@gmail.com wrote:
> 
>> On Aug 12, 2014, at 7:40 AM, Alan Lawrence <alan.lawrence@arm.com> wrote:
>>
>> (It is no more expensive.)
> 
> Yes on some processors it could be. 
> 
> Thanks,
> Andrew
> 
> 
>> gcc/ChangeLog:
>>
>>    * config/aarch64/aarch64.md (subdi3, adddi3_aarch64): Don't penalize
>>    SIMD reg variant.
>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index f8eb305140e7b0aed006b33f1724a90939e48316..0a8ca4bcc7941f912c8d42200b33206d4188fa48 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1188,10 +1188,10 @@ (define_insn "*adddi3_aarch64" [(set - (match_operand:DI 0 "register_operand" "=rk,rk,rk,!w") + (match_operand:DI 0 "register_operand" "=rk,rk,rk,w") (plus:DI - (match_operand:DI 1 "register_operand" "%rk,rk,rk,!w") - (match_operand:DI 2 "aarch64_plus_operand" "I,r,J,!w")))] + (match_operand:DI 1 "register_operand" "%rk,rk,rk,w") + (match_operand:DI 2 "aarch64_plus_operand" "I,r,J,w")))] "" "@ add\\t%x0, %x1, %2 @@ -1662,9 +1662,9 @@ ) (define_insn "subdi3" - [(set (match_operand:DI 0 "register_operand" "=rk,!w") -	(minus:DI (match_operand:DI 1 "register_operand" "r,!w") -	 (match_operand:DI 2 "register_operand" "r,!w")))] + [(set (match_operand:DI 0 "register_operand" "=rk,w") +	(minus:DI (match_operand:DI 1 "
register_operand" "r,w") +	 (match_operand:DI 2 "register_operand" "r,w")))] "" "@ sub\\t%x0, %x1, %x2
>
Marcus Shawcroft Sept. 2, 2014, 3:08 p.m. UTC | #5
On 18 August 2014 17:50, Alan Lawrence <alan.lawrence@arm.com> wrote:
> Well, you're right that it could be. So I presented the wrong justification.
>
> Clearly we would benefit from some better cost infrastructure here, ideally
> that is expressive, taken into account at all appropriate stages of the
> compiler, and tunable per core. I imagine that steps (patches) towards such
> infrastructure would be welcomed by both AArch64 maintainers and more
> widely.
>
> In the meantime, however, we must work with what we have. I'll still argue
> that we should remove the '!' (as per patch), however. As James has said,
> even if your add is more expensive in SIMD registers, the '!' still doesn't
> express that; and leaving it in affects code-generation on all cores. And it
> is inconsistent with other instructions.

Agreed and OK. /Marcus
diff mbox

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f8eb305140e7b0aed006b33f1724a90939e48316..0a8ca4bcc7941f912c8d42200b33206d4188fa48 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1188,10 +1188,10 @@ 
 
 (define_insn "*adddi3_aarch64"
   [(set
-    (match_operand:DI 0 "register_operand" "=rk,rk,rk,!w")
+    (match_operand:DI 0 "register_operand" "=rk,rk,rk,w")
     (plus:DI
-     (match_operand:DI 1 "register_operand" "%rk,rk,rk,!w")
-     (match_operand:DI 2 "aarch64_plus_operand" "I,r,J,!w")))]
+     (match_operand:DI 1 "register_operand" "%rk,rk,rk,w")
+     (match_operand:DI 2 "aarch64_plus_operand" "I,r,J,w")))]
   ""
   "@
   add\\t%x0, %x1, %2
@@ -1662,9 +1662,9 @@ 
 )
 
 (define_insn "subdi3"
-  [(set (match_operand:DI 0 "register_operand" "=rk,!w")
-	(minus:DI (match_operand:DI 1 "register_operand" "r,!w")
-		   (match_operand:DI 2 "register_operand" "r,!w")))]
+  [(set (match_operand:DI 0 "register_operand" "=rk,w")
+	(minus:DI (match_operand:DI 1 "register_operand" "r,w")
+		   (match_operand:DI 2 "register_operand" "r,w")))]
   ""
   "@
    sub\\t%x0, %x1, %x2