Message ID | DB6PR0801MB2053393718EE4BAE98915F00835C0@DB6PR0801MB2053.eurprd08.prod.outlook.com |
---|---|
State | New |
Series | [AArch64] Set default sched pressure algorithm |
On Fri, Nov 3, 2017 at 12:11 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> The Arm backend sets the default sched-pressure algorithm to
> SCHED_PRESSURE_MODEL. Benchmarking on AArch64 shows this
> speeds up floating point performance on SPEC - eg. CactusBSSN improves
> by ~16%. The gains are mostly due to less spilling, so enable this on AArch64
> by default.
>
> OK for commit?

I am ok with this from my point of view. The rs6000, arm and s390
back-ends all enable the same way. I suspect all RISC targets should
enable this way too.

Thanks,
Andrew

>
> 2017-11-02 Wilco Dijkstra <wdijkstr@arm.com>
>
>     * config/aarch64/aarch64.c (aarch64_override_options_internal):
>     Set PARAM_SCHED_PRESSURE_ALGORITHM to SCHED_PRESSURE_MODEL.
>
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 34456e96497ac7b6d2f9931187ff05619e1934a4..750b0bc29c0963742d5d7bb4ae4619d93bec3e4a 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9276,6 +9276,11 @@ aarch64_override_options_internal (struct gcc_options *opts)
>                           opts->x_param_values,
>                           global_options_set.x_param_values);
>
> +  /* Use the alternative scheduling-pressure algorithm by default. */
> +  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
> +                         opts->x_param_values,
> +                         global_options_set.x_param_values);
> +
>    /* Enable sw prefetching at specified optimization level for
>       CPUS that have prefetch. Lower optimization level threshold by 1
>       when profiling is enabled.
*/
On Fri, Nov 3, 2017 at 6:38 AM, Andrew Pinski <apinski@cavium.com> wrote:
> On Fri, Nov 3, 2017 at 12:11 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>> The Arm backend sets the default sched-pressure algorithm to
>> SCHED_PRESSURE_MODEL. Benchmarking on AArch64 shows this
>> speeds up floating point performance on SPEC - eg. CactusBSSN improves
>> by ~16%. The gains are mostly due to less spilling, so enable this on AArch64
>> by default.
>>
>> OK for commit?
>
> I am ok with this from my point of view. The rs6000, arm and s390
> back-ends all enable the same way. I suspect all RISC targets should
> enable this way too.

I think all OOO execution capable CPUs should. Ideally this wouldn't be
a choice between two models but the scheduler would take into account
register pressure anyways. Or we should always schedule with sched-pressure
during first scheduling.

Richard.

> Thanks,
> Andrew
>
>>
>> 2017-11-02 Wilco Dijkstra <wdijkstr@arm.com>
>>
>>     * config/aarch64/aarch64.c (aarch64_override_options_internal):
>>     Set PARAM_SCHED_PRESSURE_ALGORITHM to SCHED_PRESSURE_MODEL.
>>
>> --
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 34456e96497ac7b6d2f9931187ff05619e1934a4..750b0bc29c0963742d5d7bb4ae4619d93bec3e4a 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -9276,6 +9276,11 @@ aarch64_override_options_internal (struct gcc_options *opts)
>>                           opts->x_param_values,
>>                           global_options_set.x_param_values);
>>
>> +  /* Use the alternative scheduling-pressure algorithm by default. */
>> +  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
>> +                         opts->x_param_values,
>> +                         global_options_set.x_param_values);
>> +
>>    /* Enable sw prefetching at specified optimization level for
>>       CPUS that have prefetch. Lower optimization level threshold by 1
>>       when profiling is enabled.
*/
Richard Biener wrote:
> On Fri, Nov 3, 2017 at 6:38 AM, Andrew Pinski <apinski@cavium.com> wrote:
> > On Fri, Nov 3, 2017 at 12:11 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> >> The Arm backend sets the default sched-pressure algorithm to
> >> SCHED_PRESSURE_MODEL. Benchmarking on AArch64 shows this
> >> speeds up floating point performance on SPEC - eg. CactusBSSN improves
> >> by ~16%. The gains are mostly due to less spilling, so enable this on AArch64
> >> by default.
> >>
> >> OK for commit?
> >
> > I am ok with this from my point of view. The rs6000, arm and s390
> > back-ends all enable the same way. I suspect all RISC targets should
> > enable this way too.
>
> I think all OOO execution capable CPUs should. Ideally this wouldn't be
> a choice between two models but the scheduler would take into account
> register pressure anyways. Or we should always schedule with sched-pressure
> during first scheduling.

Of the 6 targets which use -fsched-pressure, 5 prefer SCHED_PRESSURE_MODEL,
so we could just make that the default (nds32 is the only exception, but it
has 32 registers so that should not be an issue). This also fits nicely with
my patches to improve GCC settings to be more optimal.

Wilco
On Thu, Nov 02, 2017 at 06:41:58PM +0000, Wilco Dijkstra wrote:
> The Arm backend sets the default sched-pressure algorithm to
> SCHED_PRESSURE_MODEL. Benchmarking on AArch64 shows this
> speeds up floating point performance on SPEC - eg. CactusBSSN improves
> by ~16%. The gains are mostly due to less spilling, so enable this on AArch64
> by default.
>
> OK for commit?

OK.

Reviewed-By: James Greenhalgh <james.greenhalgh@arm.com>

Thanks,
James

>
> 2017-11-02 Wilco Dijkstra <wdijkstr@arm.com>
>
>     * config/aarch64/aarch64.c (aarch64_override_options_internal):
>     Set PARAM_SCHED_PRESSURE_ALGORITHM to SCHED_PRESSURE_MODEL.
>
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 34456e96497ac7b6d2f9931187ff05619e1934a4..750b0bc29c0963742d5d7bb4ae4619d93bec3e4a 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9276,6 +9276,11 @@ aarch64_override_options_internal (struct gcc_options *opts)
>                           opts->x_param_values,
>                           global_options_set.x_param_values);
>
> +  /* Use the alternative scheduling-pressure algorithm by default. */
> +  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
> +                         opts->x_param_values,
> +                         global_options_set.x_param_values);
> +
>    /* Enable sw prefetching at specified optimization level for
>       CPUS that have prefetch. Lower optimization level threshold by 1
>       when profiling is enabled. */
> On Nov 2, 2017, at 9:41 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>
> The Arm backend sets the default sched-pressure algorithm to
> SCHED_PRESSURE_MODEL. Benchmarking on AArch64 shows this
> speeds up floating point performance on SPEC - eg. CactusBSSN improves
> by ~16%. The gains are mostly due to less spilling, so enable this on AArch64
> by default.

Hi Wilco,

Any notable regressions?

>
> OK for commit?

Looks good to me.

--
Maxim Kuvyrkov
www.linaro.org

>
> 2017-11-02 Wilco Dijkstra <wdijkstr@arm.com>
>
>     * config/aarch64/aarch64.c (aarch64_override_options_internal):
>     Set PARAM_SCHED_PRESSURE_ALGORITHM to SCHED_PRESSURE_MODEL.
>
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 34456e96497ac7b6d2f9931187ff05619e1934a4..750b0bc29c0963742d5d7bb4ae4619d93bec3e4a 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9276,6 +9276,11 @@ aarch64_override_options_internal (struct gcc_options *opts)
>                           opts->x_param_values,
>                           global_options_set.x_param_values);
>
> +  /* Use the alternative scheduling-pressure algorithm by default. */
> +  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
> +                         opts->x_param_values,
> +                         global_options_set.x_param_values);
> +
>    /* Enable sw prefetching at specified optimization level for
>       CPUS that have prefetch. Lower optimization level threshold by 1
>       when profiling is enabled. */
Maxim Kuvyrkov wrote:
> > On Nov 2, 2017, at 9:41 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> >
> > The Arm backend sets the default sched-pressure algorithm to
> > SCHED_PRESSURE_MODEL. Benchmarking on AArch64 shows this
> > speeds up floating point performance on SPEC - eg. CactusBSSN improves
> > by ~16%. The gains are mostly due to less spilling, so enable this on AArch64
> > by default.
>
> Hi Wilco,
>
> Any notable regressions?

No, nothing that stands out. There were a few regressions on Cortex-A57 but none
reproduced on Cortex-A72, so they are not real regressions. The gains do reproduce
and instruction counts are lower (most binaries show a significant reduction in spills).

Wilco
On 6 November 2017 at 14:05, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> Maxim Kuvyrkov wrote:
>> > On Nov 2, 2017, at 9:41 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>> >
>> > The Arm backend sets the default sched-pressure algorithm to
>> > SCHED_PRESSURE_MODEL. Benchmarking on AArch64 shows this
>> > speeds up floating point performance on SPEC - eg. CactusBSSN improves
>> > by ~16%. The gains are mostly due to less spilling, so enable this on AArch64
>> > by default.
>>
>> Hi Wilco,
>>
>> Any notable regressions?
>
> No, nothing that stands out. There were a few regressions on Cortex-A57 but none
> reproduced on Cortex-A72, so they are not real regressions. The gains do reproduce
> and instruction counts are lower (most binaries show a significant reduction in spills).
>
> Wilco

Hi Wilco,

After you committed this patch (r254378), I noticed a few regressions:

FAIL: gcc.target/aarch64/subs_compare_1.c scan-assembler-not cmp\\tw[0-9]+, w[0-9]+
FAIL: gcc.target/aarch64/subs_compare_1.c scan-assembler-times subs\\tw[0-9]+, w[0-9]+, w[0-9]+ 1 (found 0 times)
FAIL: gcc.target/aarch64/subs_compare_2.c scan-assembler-times subs\\tw[0-9]+, w[0-9]+, #4 1 (found 0 times)

I'm still catching-up, so maybe you already fixed this in a subsequent commit?
(It's still failing in my most recent validation to date, r254467).

Thanks,

Christophe
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 34456e96497ac7b6d2f9931187ff05619e1934a4..750b0bc29c0963742d5d7bb4ae4619d93bec3e4a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9276,6 +9276,11 @@ aarch64_override_options_internal (struct gcc_options *opts)
 			 opts->x_param_values,
 			 global_options_set.x_param_values);
 
+  /* Use the alternative scheduling-pressure algorithm by default. */
+  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
+			 opts->x_param_values,
+			 global_options_set.x_param_values);
+
   /* Enable sw prefetching at specified optimization level for
      CPUS that have prefetch. Lower optimization level threshold by 1
      when profiling is enabled. */
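
Note on the patch: the call to maybe_set_param_value only installs a backend default. As far as I understand the helper, it leaves the parameter alone when the user has already set it on the command line. The standalone sketch below models that behaviour. Every demo_* name is a hypothetical stand-in, not GCC code, and the numeric values (1 for the weighted algorithm as the generic default, 2 for the model algorithm) are assumptions taken from the documented range of --param sched-pressure-algorithm.

```c
#include <stdbool.h>

/* Hypothetical stand-in for GCC's parameter index enum.  */
enum demo_param { DEMO_SCHED_PRESSURE_ALGORITHM, DEMO_NUM_PARAMS };

/* Assumed numeric values of the two sched-pressure algorithms.  */
enum { DEMO_PRESSURE_WEIGHTED = 1, DEMO_PRESSURE_MODEL = 2 };

/* Simplified model of the two arrays passed in the patch: the current
   values and a flag saying whether the user set each one explicitly.  */
struct demo_options {
  int values[DEMO_NUM_PARAMS];
  bool set_by_user[DEMO_NUM_PARAMS];
};

/* Model of an explicit --param on the command line.  */
void demo_set_from_command_line(struct demo_options *opts,
                                enum demo_param p, int value)
{
  opts->values[p] = value;
  opts->set_by_user[p] = true;
}

/* Model of a backend default: applied only if the user did not choose.  */
void demo_maybe_set_param_value(struct demo_options *opts,
                                enum demo_param p, int value)
{
  if (!opts->set_by_user[p])
    opts->values[p] = value;
}

/* Return the algorithm in effect after the backend override runs.
   user_value == 0 means the user passed no --param at all.  */
int demo_effective_algorithm(int user_value)
{
  /* Start from the assumed generic default (weighted).  */
  struct demo_options opts = { { DEMO_PRESSURE_WEIGHTED }, { false } };

  if (user_value != 0)
    demo_set_from_command_line(&opts, DEMO_SCHED_PRESSURE_ALGORITHM,
                               user_value);

  /* What the patch adds for AArch64, in this simplified model.  */
  demo_maybe_set_param_value(&opts, DEMO_SCHED_PRESSURE_ALGORITHM,
                             DEMO_PRESSURE_MODEL);
  return opts.values[DEMO_SCHED_PRESSURE_ALGORITHM];
}
```

Under these assumptions, demo_effective_algorithm(0) selects the model algorithm, while demo_effective_algorithm(1) keeps the user's choice, so an explicit --param sched-pressure-algorithm=1 should still take precedence over the new AArch64 default.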