diff mbox series

rs6000: Disable -fcaller-saves by default

Message ID 8cc18935-a9aa-513f-eef2-154130b70a3b@linux.ibm.com
State New
Headers show
Series rs6000: Disable -fcaller-saves by default | expand

Commit Message

Pat Haugen July 23, 2020, 7:25 p.m. UTC
Disable -fcaller-saves by default.

This patch turns off -fcaller-saves by default so that IRA doesn't try
to use volatile regs for pseudos that are live across a call, which
would then require LRA to save/restore the reg around the call.

Bootstrap/regtest on powerpc64le with no new regressions. Also ran a
CPU2017 benchmark comparison with no major differences (a few minor
improvements and one minor degradation). Ok for trunk?

-Pat


2020-07-23  Pat Haugen  <pthaugen@linux.ibm.com>

gcc/
	* common/config/rs6000/rs6000-common.c
	(rs6000_option_optimization_table): Turn off -fcaller-saves.

gcc/testsuite/
	* gcc.target/powerpc/caller-saves.c: New.

Comments

Li, Pan2 via Gcc-patches July 23, 2020, 7:42 p.m. UTC | #1
On Thu, 2020-07-23 at 14:25 -0500, Pat Haugen via Gcc-patches wrote:
> Disable -fcaller-saves by default.
> 
> This patch turns off -fcaller-saves by default so that IRA doesn't try
> to use volatile regs for pseudos that are live across a call, which
> would then require LRA to save/restore the reg around the call.
> 
> Bootstrap/regtest on powerpc64le with no new regressions. Also ran a
> CPU2017 benchmark comparison with no major differences (a few minor
> improvements and one minor degradation). Ok for trunk?
> 
> -Pat
> 
> 
> 2020-07-23  Pat Haugen  <pthaugen@linux.ibm.com>
> 
> gcc/
> 	* common/config/rs6000/rs6000-common.c
> 	(rs6000_option_optimization_table): Turn off -fcaller-saves.
> 
> gcc/testsuite/
> 	* gcc.target/powerpc/caller-saves.c: New.
What's driving this change?

IRA will do a cost/benefit analysis to see using call clobbered registers like
this is profitable or not.  You're just turning the whole thing off.

This can be particularly bad for performance if you have classes with no call
saved registers.

Jeff
Segher Boessenkool July 23, 2020, 8:19 p.m. UTC | #2
Hi!

On Thu, Jul 23, 2020 at 01:42:59PM -0600, Jeff Law wrote:
> On Thu, 2020-07-23 at 14:25 -0500, Pat Haugen via Gcc-patches wrote:
> > Disable -fcaller-saves by default.
> > 
> > This patch turns off -fcaller-saves by default so that IRA doesn't try
> > to use volatile regs for pseudos that are live across a call, which
> > would then require LRA to save/restore the reg around the call.

> > 2020-07-23  Pat Haugen  <pthaugen@linux.ibm.com>
> > 
> > gcc/
> > 	* common/config/rs6000/rs6000-common.c
> > 	(rs6000_option_optimization_table): Turn off -fcaller-saves.
> > 
> > gcc/testsuite/
> > 	* gcc.target/powerpc/caller-saves.c: New.

> What's driving this change?

Peter noticed IRA allocates stuff to volatile registers while it is life
through a call, and then LRA has to correct that, not optimal.

> IRA will do a cost/benefit analysis to see using call clobbered registers like
> this is profitable or not.  You're just turning the whole thing off.

'-fcaller-saves'
     Enable allocation of values to registers that are clobbered by
     function calls, by emitting extra instructions to save and restore
     the registers around such calls.  Such allocation is done only when
     it seems to result in better code.

     This option is always enabled by default on certain machines,
     usually those which have no call-preserved registers to use
     instead.

So, given what Peter saw, the analysis when to do or not do this isn't
as good as could be hoped for.

> This can be particularly bad for performance if you have classes with no call
> saved registers.

Do we though?  Well, "d" for vectors, but there shouldn't be insns that
require that?


Segher
Li, Pan2 via Gcc-patches July 23, 2020, 8:29 p.m. UTC | #3
On Thu, 2020-07-23 at 15:19 -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jul 23, 2020 at 01:42:59PM -0600, Jeff Law wrote:
> > On Thu, 2020-07-23 at 14:25 -0500, Pat Haugen via Gcc-patches wrote:
> > > Disable -fcaller-saves by default.
> > > 
> > > This patch turns off -fcaller-saves by default so that IRA doesn't try
> > > to use volatile regs for pseudos that are live across a call, which
> > > would then require LRA to save/restore the reg around the call.
> > > 2020-07-23  Pat Haugen  <pthaugen@linux.ibm.com>
> > > 
> > > gcc/
> > > 	* common/config/rs6000/rs6000-common.c
> > > 	(rs6000_option_optimization_table): Turn off -fcaller-saves.
> > > 
> > > gcc/testsuite/
> > > 	* gcc.target/powerpc/caller-saves.c: New.
> > What's driving this change?
> 
> Peter noticed IRA allocates stuff to volatile registers while it is life
> through a call, and then LRA has to correct that, not optimal.
I can't recall if IRA or LRA handles the need to save them to the stack, but the
whole point is you can do much better sometimes by saving into the stack with the
caller-saves algorithm vs just giving up and spilling.

> 
> > IRA will do a cost/benefit analysis to see using call clobbered registers like
> > this is profitable or not.  You're just turning the whole thing off.
> 
> '-fcaller-saves'
>      Enable allocation of values to registers that are clobbered by
>      function calls, by emitting extra instructions to save and restore
>      the registers around such calls.  Such allocation is done only when
>      it seems to result in better code.
> 
>      This option is always enabled by default on certain machines,
>      usually those which have no call-preserved registers to use
>      instead.
> 
> So, given what Peter saw, the analysis when to do or not do this isn't
> as good as could be hoped for.
Sure.  That obviously happens.  When it does the usual response is to dig deeper
into why, not turn the entire option off.

Jeff
Segher Boessenkool July 23, 2020, 8:43 p.m. UTC | #4
On Thu, Jul 23, 2020 at 02:29:00PM -0600, Jeff Law wrote:
> On Thu, 2020-07-23 at 15:19 -0500, Segher Boessenkool wrote:
> > On Thu, Jul 23, 2020 at 01:42:59PM -0600, Jeff Law wrote:
> > > On Thu, 2020-07-23 at 14:25 -0500, Pat Haugen via Gcc-patches wrote:
> > > > Disable -fcaller-saves by default.
> > > > 
> > > > This patch turns off -fcaller-saves by default so that IRA doesn't try
> > > > to use volatile regs for pseudos that are live across a call, which
> > > > would then require LRA to save/restore the reg around the call.
> > > > 2020-07-23  Pat Haugen  <pthaugen@linux.ibm.com>
> > > > 
> > > > gcc/
> > > > 	* common/config/rs6000/rs6000-common.c
> > > > 	(rs6000_option_optimization_table): Turn off -fcaller-saves.
> > > > 
> > > > gcc/testsuite/
> > > > 	* gcc.target/powerpc/caller-saves.c: New.
> > > What's driving this change?
> > 
> > Peter noticed IRA allocates stuff to volatile registers while it is life
> > through a call, and then LRA has to correct that, not optimal.
> I can't recall if IRA or LRA handles the need to save them to the stack, but the
> whole point is you can do much better sometimes by saving into the stack with the
> caller-saves algorithm vs just giving up and spilling.

But the only thing this flag influences is what registers IRA allocates,
nothing else, or what am I missing?

> > > IRA will do a cost/benefit analysis to see using call clobbered registers like
> > > this is profitable or not.  You're just turning the whole thing off.
> > 
> > '-fcaller-saves'
> >      Enable allocation of values to registers that are clobbered by
> >      function calls, by emitting extra instructions to save and restore
> >      the registers around such calls.  Such allocation is done only when
> >      it seems to result in better code.
> > 
> >      This option is always enabled by default on certain machines,
> >      usually those which have no call-preserved registers to use
> >      instead.

The documentation does not match reality btw, and even contradicts it
for many targets.

> > So, given what Peter saw, the analysis when to do or not do this isn't
> > as good as could be hoped for.
> Sure.  That obviously happens.  When it does the usual response is to dig deeper
> into why, not turn the entire option off.

That is after digging deeper!  Maybe not deep enough, hrm.  Peter?


Segher
Peter Bergner Aug. 10, 2020, 4:11 p.m. UTC | #5
On 7/23/20 3:29 PM, Jeff Law wrote:
>>> What's driving this change?
>>
>> Peter noticed IRA allocates stuff to volatile registers while it is life
>> through a call, and then LRA has to correct that, not optimal.
> I can't recall if IRA or LRA handles the need to save them to the stack, but the
> whole point is you can do much better sometimes by saving into the stack with the
> caller-saves algorithm vs just giving up and spilling.
> 
>>
>>> IRA will do a cost/benefit analysis to see using call clobbered registers like
>>> this is profitable or not.  You're just turning the whole thing off.


Sorry for taking so long to reply.  I'm the guilty party for asking Pat
to submit the patch. :-)  I was not aware IRA did that and just assumed
it was a bug.  For now, consider the patch withdrawn.  That said, I
will have to look at that cost computation, especially since when I
last looked, IRA does not count potential prologue/epilogue save/restore
insns if it were to assign a non-volatile register when computing reg
costs.  That would seem to be an important consideration here.

I'll note this all came about when I was looking at PR96017, which is
due to not shrink-wrapping a pseudo.  That was due to it being live
across a call.  I first I thought (for the 2nd test case, not the original
one) split_live_ranges_for_shrink_wrap() was nicely splitting the pseudo
for us, but it must have been the caller-saves algorithm you mention above.
However, that code doesn't handle the original test case, which I think
it should.

My thought for that bug was to introduce some extra splitting before RA
(maybe as part of split_live_ranges_for_shrink_wrap?) where we split
pseudos that are live across a call, but that have at least one path
to the exit that doesn't cross a call.  However, if we can beef up
the caller-saves cost computation, maybe that would work too?


Peter
Li, Pan2 via Gcc-patches Aug. 25, 2020, 8:35 p.m. UTC | #6
On Mon, 2020-08-10 at 11:11 -0500, Peter Bergner wrote:
> On 7/23/20 3:29 PM, Jeff Law wrote:
> > > > What's driving this change?
> > > 
> > > Peter noticed IRA allocates stuff to volatile registers while it is life
> > > through a call, and then LRA has to correct that, not optimal.
> > I can't recall if IRA or LRA handles the need to save them to the stack, but the
> > whole point is you can do much better sometimes by saving into the stack with the
> > caller-saves algorithm vs just giving up and spilling.
> > 
> > > > IRA will do a cost/benefit analysis to see using call clobbered registers like
> > > > this is profitable or not.  You're just turning the whole thing off.
> 
> Sorry for taking so long to reply.  I'm the guilty party for asking Pat
> to submit the patch. :-)  I was not aware IRA did that and just assumed
> it was a bug.  For now, consider the patch withdrawn.  That said, I
> will have to look at that cost computation, especially since when I
> last looked, IRA does not count potential prologue/epilogue save/restore
> insns if it were to assign a non-volatile register when computing reg
> costs.  That would seem to be an important consideration here.
No worries.  Yea, you want to count the prologue/epilogue, as well as the
saves/restores at the call points  (which need frequency scaling and accounting
for saves which don't need to happen because a prior save is sufficient to cover
more than one call), etc.

I think much of this code is still in caller-save.c.  It's been eons since I
worked on it, but I can probably get reacquainted with the implementation if you
have questions



> I'll note this all came about when I was looking at PR96017, which is
> due to not shrink-wrapping a pseudo.  That was due to it being live
> across a call.  I first I thought (for the 2nd test case, not the original
> one) split_live_ranges_for_shrink_wrap() was nicely splitting the pseudo
> for us, but it must have been the caller-saves algorithm you mention above.
> However, that code doesn't handle the original test case, which I think
> it should.
> 
> My thought for that bug was to introduce some extra splitting before RA
> (maybe as part of split_live_ranges_for_shrink_wrap?) where we split
> pseudos that are live across a call, but that have at least one path
> to the exit that doesn't cross a call.  However, if we can beef up
> the caller-saves cost computation, maybe that would work too?
I've gone back and forth on pre allocation splitting as well as post-allocating
splitting and re-allocation.   I could argue either side of that discussion --
given we've got a bit of special code for splitting to help shrink wrapping,
maybe that's the best place if we need to do splitting before RA since this was
triggered by digging into a shrink-wrapping problem.

I'd probably start by making sure the cost computation is sane though.

Jeff
Segher Boessenkool Aug. 26, 2020, 8:58 p.m. UTC | #7
On Tue, Aug 25, 2020 at 02:35:51PM -0600, Jeff Law wrote:
> I've gone back and forth on pre allocation splitting as well as post-allocating
> splitting and re-allocation.   I could argue either side of that discussion --

If you end up wanting something split it is best to do it early.  But if
you cannot predict well if you want it split eventually, there is no
good way to do this afaics :-(

> given we've got a bit of special code for splitting to help shrink wrapping,

Only very minimal stuff for splitting copies from a hard reg (argument reg)
to a pseudo.  And then only for all together.

> I'd probably start by making sure the cost computation is sane though.

Yeah.


Segher
Li, Pan2 via Gcc-patches Aug. 26, 2020, 9:10 p.m. UTC | #8
On Wed, 2020-08-26 at 15:58 -0500, Segher Boessenkool wrote:
> On Tue, Aug 25, 2020 at 02:35:51PM -0600, Jeff Law wrote:
> > I've gone back and forth on pre allocation splitting as well as post-allocating
> > splitting and re-allocation.   I could argue either side of that discussion --
> 
> If you end up wanting something split it is best to do it early. 
The problem with early splitting is you don't know if you want to split or not.
 You're making an educated guess how the split will interact with register
allocation.  If you guess wrong, often the allocator will save you and assign the
split pseudos to the same hard reg, but not always.  Been there more than once.



>  But if
> you cannot predict well if you want it split eventually, there is no
> good way to do this afaics :-(
Precisely.  In the old LRS bits (Meissner with a little help from me, circa 1995)
we actually recorded the original pseudo as well as all its split versions in a
new RTL node.  We could then undo the splitting if it wasn't helpful.  IIRC we
also use that node to improve debugging of objects that were subject to LRS.

It worked reasonably well, but was fairly hard to maintain.  I think we managed
to drop in some of the infrastructure into EGCS and shipped to to our customer,
but I don't think we ever got the main pass submitted.  Eventually it got dropped
as it was fairly painful to maintain.

jeff
diff mbox series

Patch

diff --git a/gcc/common/config/rs6000/rs6000-common.c
b/gcc/common/config/rs6000/rs6000-common.c
index ee37b9dc90b..6d68834aed2 100644
--- a/gcc/common/config/rs6000/rs6000-common.c
+++ b/gcc/common/config/rs6000/rs6000-common.c
@@ -43,8 +43,13 @@  static const struct default_options
rs6000_option_optimization_table[] =
        on rs6000, turn it off by default.  */
     { OPT_LEVELS_ALL, OPT_frename_registers, NULL, 0 },

+    /* Don't allow IRA to use volatile regs across function calls.  */
+    { OPT_LEVELS_ALL, OPT_fcaller_saves, NULL, 0 },
+
     /* Double growth factor to counter reduced min jump length.  */
     { OPT_LEVELS_ALL, OPT__param_max_grow_copy_bb_insns_, NULL, 16 },
+
+    /* Following marks the end of the list and needs to remain last.  */
     { OPT_LEVELS_NONE, 0, NULL, 0 }
   };

diff --git a/gcc/testsuite/gcc.target/powerpc/caller-saves.c
b/gcc/testsuite/gcc.target/powerpc/caller-saves.c
new file mode 100644
index 00000000000..846356f16f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/caller-saves.c
@@ -0,0 +1,9 @@ 
+/* { dg-do compile */
+/* { dg-options "-O2 -fverbose-asm" } */
+
+void
+foo ()
+{
+}
+
+/* { dg-final { scan-assembler-not "fcaller-saves" } } */