diff mbox series

[AArch64] Set default sched pressure algorithm

Message ID DB6PR0801MB2053393718EE4BAE98915F00835C0@DB6PR0801MB2053.eurprd08.prod.outlook.com
State New

Commit Message

Wilco Dijkstra Nov. 2, 2017, 6:41 p.m. UTC
The Arm backend sets the default sched-pressure algorithm to
SCHED_PRESSURE_MODEL.  Benchmarking on AArch64 shows this
speeds up floating point performance on SPEC, e.g. CactusBSSN improves
by ~16%.  The gains are mostly due to less spilling, so enable this on AArch64
by default.

OK for commit?

2017-11-02  Wilco Dijkstra  <wdijkstr@arm.com>

	* config/aarch64/aarch64.c (aarch64_override_options_internal):
	Set PARAM_SCHED_PRESSURE_ALGORITHM to SCHED_PRESSURE_MODEL.

--

Comments

Andrew Pinski Nov. 3, 2017, 5:38 a.m. UTC | #1
On Fri, Nov 3, 2017 at 12:11 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> The Arm backend sets the default sched-pressure algorithm to
> SCHED_PRESSURE_MODEL.  Benchmarking on AArch64 shows this
> speeds up floating point performance on SPEC - eg. CactusBSSN improves
> by ~16%.  The gains are mostly due to less spilling, so enable this on AArch64
> by default.
>
> OK for commit?

I am ok with this from my point of view.  The rs6000, arm and s390
back-ends all enable the same way.  I suspect all RISC targets should
enable this way too.

Thanks,
Andrew

>
> 2017-11-02  Wilco Dijkstra  <wdijkstr@arm.com>
>
>         * config/aarch64/aarch64.c (aarch64_override_options_internal):
>         Set PARAM_SCHED_PRESSURE_ALGORITHM to SCHED_PRESSURE_MODEL.
>
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 34456e96497ac7b6d2f9931187ff05619e1934a4..750b0bc29c0963742d5d7bb4ae4619d93bec3e4a 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9276,6 +9276,11 @@ aarch64_override_options_internal (struct gcc_options *opts)
>                            opts->x_param_values,
>                            global_options_set.x_param_values);
>
> +  /* Use the alternative scheduling-pressure algorithm by default.  */
> +  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
> +                        opts->x_param_values,
> +                        global_options_set.x_param_values);
> +
>    /* Enable sw prefetching at specified optimization level for
>       CPUS that have prefetch.  Lower optimization level threshold by 1
>       when profiling is enabled.  */
Richard Biener Nov. 3, 2017, 10:11 a.m. UTC | #2
On Fri, Nov 3, 2017 at 6:38 AM, Andrew Pinski <apinski@cavium.com> wrote:
> On Fri, Nov 3, 2017 at 12:11 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>> The Arm backend sets the default sched-pressure algorithm to
>> SCHED_PRESSURE_MODEL.  Benchmarking on AArch64 shows this
>> speeds up floating point performance on SPEC - eg. CactusBSSN improves
>> by ~16%.  The gains are mostly due to less spilling, so enable this on AArch64
>> by default.
>>
>> OK for commit?
>
> I am ok with this from my point of view.  The rs6000, arm and s390
> back-ends all enable the same way.  I suspect all RISC targets should
> enable this way too.

I think all OOO execution capable CPUs should.  Ideally this wouldn't be
a choice between two models but the scheduler would take into account
register pressure anyways.  Or we should always schedule with sched-pressure
during first scheduling.

Richard.

> Thanks,
> Andrew
>
>>
>> 2017-11-02  Wilco Dijkstra  <wdijkstr@arm.com>
>>
>>         * config/aarch64/aarch64.c (aarch64_override_options_internal):
>>         Set PARAM_SCHED_PRESSURE_ALGORITHM to SCHED_PRESSURE_MODEL.
>>
>> --
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 34456e96497ac7b6d2f9931187ff05619e1934a4..750b0bc29c0963742d5d7bb4ae4619d93bec3e4a 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -9276,6 +9276,11 @@ aarch64_override_options_internal (struct gcc_options *opts)
>>                            opts->x_param_values,
>>                            global_options_set.x_param_values);
>>
>> +  /* Use the alternative scheduling-pressure algorithm by default.  */
>> +  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
>> +                        opts->x_param_values,
>> +                        global_options_set.x_param_values);
>> +
>>    /* Enable sw prefetching at specified optimization level for
>>       CPUS that have prefetch.  Lower optimization level threshold by 1
>>       when profiling is enabled.  */
Wilco Dijkstra Nov. 3, 2017, 12:48 p.m. UTC | #3
Richard Biener wrote:
> On Fri, Nov 3, 2017 at 6:38 AM, Andrew Pinski <apinski@cavium.com> wrote:
> > On Fri, Nov 3, 2017 at 12:11 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> >> The Arm backend sets the default sched-pressure algorithm to
> >> SCHED_PRESSURE_MODEL.  Benchmarking on AArch64 shows this
> >> speeds up floating point performance on SPEC - eg. CactusBSSN improves
> >> by ~16%.  The gains are mostly due to less spilling, so enable this on AArch64
> >> by default.
> >>
> >> OK for commit?
> >
> > I am ok with this from my point of view.  The rs6000, arm and s390
> > back-ends all enable the same way.  I suspect all RISC targets should
> > enable this way too.
>
> I think all OOO execution capable CPUs should.  Ideally this wouldn't be
> a choice between two models but the scheduler would take into account
> register pressure anyways.  Or we should always schedule with sched-pressure
> during first scheduling.

Of the 6 targets which use -fsched-pressure, 5 prefer SCHED_PRESSURE_MODEL,
so we could just make that the default (nds32 is the only exception, but it
has 32 registers so that should not be an issue).

This also fits nicely with my patches to improve GCC settings to be more optimal.

Wilco
James Greenhalgh Nov. 3, 2017, 3:08 p.m. UTC | #4
On Thu, Nov 02, 2017 at 06:41:58PM +0000, Wilco Dijkstra wrote:
> The Arm backend sets the default sched-pressure algorithm to
> SCHED_PRESSURE_MODEL.  Benchmarking on AArch64 shows this 
> speeds up floating point performance on SPEC - eg. CactusBSSN improves
> by ~16%.  The gains are mostly due to less spilling, so enable this on AArch64
> by default.
> 
> OK for commit?

OK.

Reviewed-By: James Greenhalgh <james.greenhalgh@arm.com>

Thanks,
James

> 
> 2017-11-02  Wilco Dijkstra  <wdijkstr@arm.com>
> 
> 	* config/aarch64/aarch64.c (aarch64_override_options_internal):
> 	Set PARAM_SCHED_PRESSURE_ALGORITHM to SCHED_PRESSURE_MODEL.
> 
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 34456e96497ac7b6d2f9931187ff05619e1934a4..750b0bc29c0963742d5d7bb4ae4619d93bec3e4a 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9276,6 +9276,11 @@ aarch64_override_options_internal (struct gcc_options *opts)
>  			   opts->x_param_values,
>  			   global_options_set.x_param_values);
>  
> +  /* Use the alternative scheduling-pressure algorithm by default.  */
> +  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
> +			 opts->x_param_values,
> +			 global_options_set.x_param_values);
> +
>    /* Enable sw prefetching at specified optimization level for
>       CPUS that have prefetch.  Lower optimization level threshold by 1
>       when profiling is enabled.  */
Maxim Kuvyrkov Nov. 6, 2017, 10:31 a.m. UTC | #5
> 
> On Nov 2, 2017, at 9:41 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> 
> The Arm backend sets the default sched-pressure algorithm to
> SCHED_PRESSURE_MODEL.  Benchmarking on AArch64 shows this 
> speeds up floating point performance on SPEC - eg. CactusBSSN improves
> by ~16%.  The gains are mostly due to less spilling, so enable this on AArch64
> by default.

Hi Wilco,

Any notable regressions?

> 
> OK for commit?

Looks good to me.

--
Maxim Kuvyrkov
www.linaro.org

> 
> 2017-11-02  Wilco Dijkstra  <wdijkstr@arm.com>
> 
> 	* config/aarch64/aarch64.c (aarch64_override_options_internal):
> 	Set PARAM_SCHED_PRESSURE_ALGORITHM to SCHED_PRESSURE_MODEL.
> 
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 34456e96497ac7b6d2f9931187ff05619e1934a4..750b0bc29c0963742d5d7bb4ae4619d93bec3e4a 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9276,6 +9276,11 @@ aarch64_override_options_internal (struct gcc_options *opts)
> 			   opts->x_param_values,
> 			   global_options_set.x_param_values);
> 
> +  /* Use the alternative scheduling-pressure algorithm by default.  */
> +  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
> +			 opts->x_param_values,
> +			 global_options_set.x_param_values);
> +
>   /* Enable sw prefetching at specified optimization level for
>      CPUS that have prefetch.  Lower optimization level threshold by 1
>      when profiling is enabled.  */
Wilco Dijkstra Nov. 6, 2017, 1:05 p.m. UTC | #6
Maxim Kuvyrkov wrote:
> > On Nov 2, 2017, at 9:41 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> > 
> > The Arm backend sets the default sched-pressure algorithm to
> > SCHED_PRESSURE_MODEL.  Benchmarking on AArch64 shows this 
> > speeds up floating point performance on SPEC - eg. CactusBSSN improves
> > by ~16%.  The gains are mostly due to less spilling, so enable this on AArch64
> > by default.
>
> Hi Wilco,
>
> Any notable regressions?

No, nothing that stands out. There were a few regressions on Cortex-A57 but none
reproduced on Cortex-A72, so they are not real regressions. The gains do reproduce
and instruction counts are lower (most binaries show a significant reduction in spills).

Wilco
Christophe Lyon Nov. 13, 2017, 1:40 p.m. UTC | #7
On 6 November 2017 at 14:05, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> Maxim Kuvyrkov wrote:
>> > On Nov 2, 2017, at 9:41 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>> >
>> > The Arm backend sets the default sched-pressure algorithm to
>> > SCHED_PRESSURE_MODEL.  Benchmarking on AArch64 shows this
>> > speeds up floating point performance on SPEC - eg. CactusBSSN improves
>> > by ~16%.  The gains are mostly due to less spilling, so enable this on AArch64
>> > by default.
>>
>> Hi Wilco,
>>
>> Any notable regressions?
>
> No, nothing that stands out. There were a few regressions on Cortex-A57 but none
> reproduced on Cortex-A72, so they are not real regressions. The gains do reproduce
> and instruction counts are lower (most binaries show a significant reduction in spills).
>
> Wilco

Hi Wilco,

After you committed this patch (r254378), I noticed a few regressions:
FAIL:    gcc.target/aarch64/subs_compare_1.c scan-assembler-not cmp\\tw[0-9]+, w[0-9]+
FAIL:    gcc.target/aarch64/subs_compare_1.c scan-assembler-times subs\\tw[0-9]+, w[0-9]+, w[0-9]+ 1 (found 0 times)
FAIL:    gcc.target/aarch64/subs_compare_2.c scan-assembler-times subs\\tw[0-9]+, w[0-9]+, #4 1 (found 0 times)

I'm still catching up, so maybe you already fixed this in a subsequent commit?
(It's still failing in my most recent validation to date, r254467).

Thanks,

Christophe

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 34456e96497ac7b6d2f9931187ff05619e1934a4..750b0bc29c0963742d5d7bb4ae4619d93bec3e4a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9276,6 +9276,11 @@  aarch64_override_options_internal (struct gcc_options *opts)
 			   opts->x_param_values,
 			   global_options_set.x_param_values);
 
+  /* Use the alternative scheduling-pressure algorithm by default.  */
+  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
+			 opts->x_param_values,
+			 global_options_set.x_param_values);
+
   /* Enable sw prefetching at specified optimization level for
      CPUS that have prefetch.  Lower optimization level threshold by 1
      when profiling is enabled.  */
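
For readers unfamiliar with the call used in the patch: the point of going through maybe_set_param_value rather than assigning the parameter directly is, as far as I understand it, that the backend only supplies a default and an explicit --param given by the user still wins.  The sketch below is a standalone model of that behaviour, not GCC code.  Every demo_* name is hypothetical, and the two-array layout is an assumption that mirrors the opts->x_param_values / global_options_set.x_param_values pair passed in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's parameter table; these are NOT the
   real GCC definitions, only a model of the behaviour.  */
enum demo_param
{
  DEMO_PARAM_SCHED_PRESSURE_ALGORITHM,
  DEMO_PARAM_COUNT
};

/* Assumed to mirror the order of GCC's sched-pressure choices.  */
enum demo_sched_pressure
{
  DEMO_SCHED_PRESSURE_NONE,
  DEMO_SCHED_PRESSURE_WEIGHTED,
  DEMO_SCHED_PRESSURE_MODEL
};

struct demo_params
{
  int values[DEMO_PARAM_COUNT];       /* Current value of each param.  */
  bool set_by_user[DEMO_PARAM_COUNT]; /* True if given explicitly.  */
};

/* Model of an explicit user setting: store the value and remember
   that the user asked for it.  */
void
demo_user_set_param (struct demo_params *p, enum demo_param num, int value)
{
  assert (num < DEMO_PARAM_COUNT);
  p->values[num] = value;
  p->set_by_user[num] = true;
}

/* Model of a backend default: store the value only when the user has
   not set the param explicitly.  */
void
demo_maybe_set_param (struct demo_params *p, enum demo_param num, int value)
{
  assert (num < DEMO_PARAM_COUNT);
  if (!p->set_by_user[num])
    p->values[num] = value;
}
```

With a zero-initialised struct demo_params, calling demo_maybe_set_param with DEMO_SCHED_PRESSURE_MODEL installs the new default, whereas if demo_user_set_param ran first with DEMO_SCHED_PRESSURE_WEIGHTED the later default is ignored.  That is the property that lets anyone who wants the previous behaviour keep selecting it on the command line.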