Message ID | 20170512093836.19942-1-andi@firstfloor.org |
---|---|
State | New |
On Fri, May 12, 2017 at 11:38 AM, Andi Kleen <andi@firstfloor.org> wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> With high -j parallelism the autofdo tests can randomly fail.
> autofdo uses Linux perf to record profiling data.
> Linux perf uses a locked perf buffer. By default it has
> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
>
> An individual perf record tries to grab the full 516k,
> which makes parallel perf record fail.
>
> This patch limits the perf buffer for individual perf record to
> 8 pages (32k). With the default settings this allows a parallelism
> of the test cases of 16, which is hopefully good enough
>
> (if not would need to add some kind of semaphore, or ask
> the user to increase the limit as root)
>
> I also removed an unneeded -o perf.data option
>
> Thanks to Marcin for finally spotting the problem.
>
> Passes bootstrap and test on x86_64-linux. Ok for trunk?

Ok. Can you retain -o perf.data (even if that's the default)?

Thanks,
Richard.

> gcc/testsuite/:
>
> 2017-05-12  Andi Kleen  <ak@linux.intel.com>
>
>         PR testsuite/77684
>         * lib/target-supports.exp (profopt-perf-wrapper): Remove
>         -o perf.data option. Add -m8 option.
> ---
>  gcc/testsuite/lib/target-supports.exp | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
> index 83e7f2670e6..a22657767db 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -522,9 +522,16 @@ proc check_effective_target_keeps_null_pointer_checks { } {
>
>  # Return the autofdo profile wrapper
>
> +# Linux by default allows 516KB of perf event buffers
> +# in /proc/sys/kernel/perf_event_mlock_kb
> +# Each individual perf tries to grab it
> +# This causes problems with parallel test suite runs. Instead
> +# limit us to 8 pages (32K), which should be good enough
> +# for the small test programs. With the default settings
> +# this allows parallelism of 16 and higher of parallel gcc-auto-profile
>  proc profopt-perf-wrapper { } {
>      global srcdir
> -    return "$srcdir/../config/i386/gcc-auto-profile -o perf.data "
> +    return "$srcdir/../config/i386/gcc-auto-profile -m8 "
>  }
>
>  # Return true if profiling is supported on the target.
> --
> 2.12.2
>
On 05/12/2017 03:38 AM, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> With high -j parallelism the autofdo tests can randomly fail.
> autofdo uses Linux perf to record profiling data.
> Linux perf uses a locked perf buffer. By default it has
> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
>
> An individual perf record tries to grab the full 516k,
> which makes parallel perf record fail.
>
> This patch limits the perf buffer for individual perf record to
> 8 pages (32k). With the default settings this allows a parallelism
> of the test cases of 16, which is hopefully good enough

I'm routinely building & testing with -j56 these days. We'll see if
it's sufficient.

>
> (if not would need to add some kind of semaphore, or ask
> the user to increase the limit as root)
>
> I also removed an unneeded -o perf.data option
>
> Thanks to Marcin for finally spotting the problem.
>
> Passes bootstrap and test on x86_64-linux. Ok for trunk?
>
> gcc/testsuite/:
>
> 2017-05-12  Andi Kleen  <ak@linux.intel.com>
>
>         PR testsuite/77684
>         * lib/target-supports.exp (profopt-perf-wrapper): Remove
>         -o perf.data option. Add -m8 option.

OK. But be aware that it may not ultimately be sufficient.

jeff
Andi Kleen <andi@firstfloor.org> writes:

> From: Andi Kleen <ak@linux.intel.com>
>
> With high -j parallelism the autofdo tests can randomly fail.
> autofdo uses Linux perf to record profiling data.
> Linux perf uses a locked perf buffer. By default it has
> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
>
> An individual perf record tries to grab the full 516k,
> which makes parallel perf record fail.
>
> This patch limits the perf buffer for individual perf record to
> 8 pages (32k). With the default settings this allows a parallelism
> of the test cases of 16, which is hopefully good enough

So for -jN > 16 it would silently fail again?

I think we should warn when the -jN is sufficiently large such that
tests will randomly fail, and perhaps suggest workarounds with
ulimit/etc.

Aldy
On Thu, May 18, 2017 at 10:35 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
> Andi Kleen <andi@firstfloor.org> writes:
>
>> From: Andi Kleen <ak@linux.intel.com>
>>
>> With high -j parallelism the autofdo tests can randomly fail.
>> autofdo uses Linux perf to record profiling data.
>> Linux perf uses a locked perf buffer. By default it has
>> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
>>
>> An individual perf record tries to grab the full 516k,
>> which makes parallel perf record fail.
>>
>> This patch limits the perf buffer for individual perf record to
>> 8 pages (32k). With the default settings this allows a parallelism
>> of the test cases of 16, which is hopefully good enough
>
> So for -jN > 16 it would silently fail again?
>
> I think we should warn when the -jN is sufficiently large such that
> tests will randomly fail, and perhaps suggest workarounds with
> ulimit/etc.

Given that make check parallelism is somewhat "explicit", can't we
simply arrange for the chunks to never get more than, say, 4 tree-prof.exp
testcases run at the same time?

Richard.

> Aldy
On May 18, 2017, at 1:35 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
> Andi Kleen <andi@firstfloor.org> writes:
>
>> From: Andi Kleen <ak@linux.intel.com>
>>
>> With high -j parallelism the autofdo tests can randomly fail.
>> autofdo uses Linux perf to record profiling data.
>> Linux perf uses a locked perf buffer. By default it has
>> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
>>
>> An individual perf record tries to grab the full 516k,
>> which makes parallel perf record fail.
>>
>> This patch limits the perf buffer for individual perf record to
>> 8 pages (32k). With the default settings this allows a parallelism
>> of the test cases of 16, which is hopefully good enough
>
> So for -jN > 16 it would silently fail again?
>
> I think we should warn when the -jN is sufficiently large such that
> tests will randomly fail, and perhaps suggest workarounds with
> ulimit/etc.

Not a big fan of warning. I'd rather smell the max, and divide by n,
or limit them to 4 (or 16 or any fixed constant) and pass appropriate
arguments or anything else that just fixes it.
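The "smell the max, and divide by n" idea above could look roughly like the following. This is only an illustrative sketch in Python, not testsuite code (the testsuite itself is Tcl). The function names, the 4 KB page size, and the fallback value of 516 are assumptions made for the example. It also assumes the `-m` value should be a power of two, as the perf-record documentation describes for a page count, and it uses the thread's simplified per-uid arithmetic rather than the kernel's exact accounting.

```python
from pathlib import Path

# Path named in the thread; holds the per-user locked perf buffer limit in KB.
MLOCK_PATH = Path("/proc/sys/kernel/perf_event_mlock_kb")


def read_mlock_kb(path=MLOCK_PATH, default=516):
    """Read the limit in KB; fall back to `default` if the file is unreadable.

    The fallback of 516 is the default quoted in the thread (an assumption
    for systems where the file is missing or not an integer).
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return default


def pages_per_job(mlock_kb, jobs, page_kb=4):
    """Largest power-of-two page count such that `jobs` buffers fit in mlock_kb.

    Returns 0 when not even one page per job fits, in which case the caller
    would have to limit parallelism instead of shrinking the buffer.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    budget_pages = mlock_kb // (jobs * page_kb)
    if budget_pages < 1:
        return 0
    # Round down to a power of two.
    return 1 << (budget_pages.bit_length() - 1)


if __name__ == "__main__":
    limit = read_mlock_kb()
    for jobs in (4, 16, 56):
        print(f"-j{jobs}: -m{pages_per_job(limit, jobs)}")
```

With the 516 KB default, this gives 8 pages at 16 jobs, matching the `-m8` in the patch, and 2 pages at the `-j56` mentioned earlier in the thread. Whether 2 pages is enough for the small test programs is not something this sketch can answer.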
On 05/18/2017 12:09 PM, Mike Stump wrote:
> On May 18, 2017, at 1:35 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
>>
>> Andi Kleen <andi@firstfloor.org> writes:
>>
>>> From: Andi Kleen <ak@linux.intel.com>
>>>
>>> With high -j parallelism the autofdo tests can randomly fail.
>>> autofdo uses Linux perf to record profiling data.
>>> Linux perf uses a locked perf buffer. By default it has
>>> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
>>>
>>> An individual perf record tries to grab the full 516k,
>>> which makes parallel perf record fail.
>>>
>>> This patch limits the perf buffer for individual perf record to
>>> 8 pages (32k). With the default settings this allows a parallelism
>>> of the test cases of 16, which is hopefully good enough
>>
>> So for -jN > 16 it would silently fail again?
>>
>> I think we should warn when the -jN is sufficiently large such that
>> tests will randomly fail, and perhaps suggest workarounds with
>> ulimit/etc.
>
> Not a big fan of warning. I'd rather smell the max, and divide by n,
> or limit them to 4 (or 16 or any fixed constant) and pass appropriate
> arguments or anything else that just fixes it.
>

You'll hear no complaints from me :).

Aldy
On 05/18/2017 02:39 AM, Richard Biener wrote:
> On Thu, May 18, 2017 at 10:35 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
>> Andi Kleen <andi@firstfloor.org> writes:
>>
>>> From: Andi Kleen <ak@linux.intel.com>
>>>
>>> With high -j parallelism the autofdo tests can randomly fail.
>>> autofdo uses Linux perf to record profiling data.
>>> Linux perf uses a locked perf buffer. By default it has
>>> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
>>>
>>> An individual perf record tries to grab the full 516k,
>>> which makes parallel perf record fail.
>>>
>>> This patch limits the perf buffer for individual perf record to
>>> 8 pages (32k). With the default settings this allows a parallelism
>>> of the test cases of 16, which is hopefully good enough
>>
>> So for -jN > 16 it would silently fail again?
>>
>> I think we should warn when the -jN is sufficiently large such that
>> tests will randomly fail, and perhaps suggest workarounds with
>> ulimit/etc.
>
> Given that make check parallelism is somewhat "explicit", can't we
> simply arrange for the chunks to never get more than, say, 4 tree-prof.exp
> testcases run at the same time?

That would be significantly better than warning.

jeff
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 83e7f2670e6..a22657767db 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -522,9 +522,16 @@ proc check_effective_target_keeps_null_pointer_checks { } {
 
 # Return the autofdo profile wrapper
 
+# Linux by default allows 516KB of perf event buffers
+# in /proc/sys/kernel/perf_event_mlock_kb
+# Each individual perf tries to grab it
+# This causes problems with parallel test suite runs. Instead
+# limit us to 8 pages (32K), which should be good enough
+# for the small test programs. With the default settings
+# this allows parallelism of 16 and higher of parallel gcc-auto-profile
 proc profopt-perf-wrapper { } {
     global srcdir
-    return "$srcdir/../config/i386/gcc-auto-profile -o perf.data "
+    return "$srcdir/../config/i386/gcc-auto-profile -m8 "
 }
 
 # Return true if profiling is supported on the target.
From: Andi Kleen <ak@linux.intel.com>

With high -j parallelism the autofdo tests can randomly fail.
autofdo uses Linux perf to record profiling data.
Linux perf uses a locked perf buffer. By default it has
around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).

An individual perf record tries to grab the full 516k,
which makes parallel perf record fail.

This patch limits the perf buffer for individual perf record to
8 pages (32k). With the default settings this allows a parallelism
of the test cases of 16, which is hopefully good enough

(if not would need to add some kind of semaphore, or ask
the user to increase the limit as root)

I also removed an unneeded -o perf.data option

Thanks to Marcin for finally spotting the problem.

Passes bootstrap and test on x86_64-linux. Ok for trunk?

gcc/testsuite/:

2017-05-12  Andi Kleen  <ak@linux.intel.com>

        PR testsuite/77684
        * lib/target-supports.exp (profopt-perf-wrapper): Remove
        -o perf.data option. Add -m8 option.
---
 gcc/testsuite/lib/target-supports.exp | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
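The figure of 16 parallel test cases follows from simple division. The sketch below restates it in Python, purely as an illustration. The function name and the 4 KB page size are assumptions. The 516 KB limit and the 8 pages (32K) behind `-m8` come from the patch and its comment. The calculation follows the simplified per-uid budget described in the message and does not try to model the kernel's actual accounting, which may differ.

```python
def max_parallel_perf_records(mlock_kb=516, pages=8, page_kb=4):
    """How many perf records of `pages` pages each fit in the mlock budget.

    Defaults: 516 KB limit and 8 pages as stated in the patch; the 4 KB
    page size is an assumption (typical for x86_64).
    """
    per_record_kb = pages * page_kb   # 8 pages * 4 KB = 32 KB
    return mlock_kb // per_record_kb  # 516 // 32


if __name__ == "__main__":
    print(max_parallel_perf_records())  # 16
```

Under these assumptions a test run with more than 16 concurrent perf records would exceed the default budget, which is the concern raised in the replies.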