
Limit perf data buffer during profiling

Message ID 20170512093836.19942-1-andi@firstfloor.org
State New

Commit Message

Andi Kleen May 12, 2017, 9:38 a.m. UTC
From: Andi Kleen <ak@linux.intel.com>

With high -j parallelism the autofdo tests can randomly fail.
autofdo uses Linux perf to record profiling data.
Linux perf uses a locked perf buffer. By default it has
around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).

An individual perf record tries to grab the full 516k,
which makes parallel perf record fail.

This patch limits the perf buffer for each individual perf record to
8 pages (32k). With the default settings this allows the test cases
a parallelism of 16, which is hopefully good enough.

(If not, we would need to add some kind of semaphore, or ask
the user to increase the limit as root.)
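
For reference, the parallelism figure follows from dividing the per-uid
limit by the per-record buffer. A minimal sketch of that arithmetic,
assuming 4 KiB pages (as on x86_64) and ignoring any extra bookkeeping
pages perf may map; the variable names are illustrative only:

```python
# Assumed values: the 516 KiB default from perf_event_mlock_kb,
# 4 KiB pages, and the 8 pages requested by -m8.
MLOCK_LIMIT_KB = 516
PAGE_KB = 4
PAGES_PER_RECORD = 8

# Locked memory one perf record asks for.
buffer_kb = PAGES_PER_RECORD * PAGE_KB

# How many such buffers fit under the per-uid limit.
max_parallel = MLOCK_LIMIT_KB // buffer_kb

print(f"{buffer_kb} KiB per record, up to {max_parallel} records in parallel")
```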

I also removed an unneeded -o perf.data option.

Thanks to Marcin for finally spotting the problem.

Passes bootstrap and test on x86_64-linux. Ok for trunk?

gcc/testsuite/:

2017-05-12  Andi Kleen  <ak@linux.intel.com>

	PR testsuite/77684
	* lib/target-supports.exp (profopt-perf-wrapper): Remove
	-o perf.data option.  Add -m8 option.
---
 gcc/testsuite/lib/target-supports.exp | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Richard Biener May 12, 2017, 9:43 a.m. UTC | #1
On Fri, May 12, 2017 at 11:38 AM, Andi Kleen <andi@firstfloor.org> wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> With high -j parallelism the autofdo tests can randomly fail.
> autofdo uses Linux perf to record profiling data.
> Linux perf uses a locked perf buffer. By default it has
> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
>
> An individual perf record tries to grab the full 516k,
> which makes parallel perf record fail.
>
> This patch limits the perf buffer for individual perf record to 8k.
> With the default settings this allows a parallelism of the test
> cases of 16, which is hopefully good enough
>
> (if not would need to add some kind of semaphore, or ask
> the user to increase the limit as root)
>
> I also removed an unneeded -o perf.data option
>
> Thanks to Marcin to finally spotting the problem.
>
> Passes bootstrap and test on x86_64-linux. Ok for trunk?

Ok.  Can you retain -o perf.data (even if that's the default)?

Thanks,
Richard.

> gcc/testsuite/:
>
> 2017-05-12  Andi Kleen  <ak@linux.intel.com>
>
>         PR testsuite/77684
>         * lib/target-supports.exp (profopt-perf-wrapper): Remove
>         -p perf.data option. Add -m8 option.
> ---
>  gcc/testsuite/lib/target-supports.exp | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
> index 83e7f2670e6..a22657767db 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -522,9 +522,16 @@ proc check_effective_target_keeps_null_pointer_checks { } {
>
>  # Return the autofdo profile wrapper
>
> +# Linux by default allows 516KB of perf event buffers
> +# in /proc/sys/kernel/perfe_event_mlock_kb
> +# Each individual perf tries to grab it
> +# This causes problems with parallel test suite runs. Instead
> +# limit us to 8 pages (32K), which should be good enough
> +# for the small test programs. With the default settings
> +# this allows parallelism of 16 and higher of parallel gcc-auto-profile
>  proc profopt-perf-wrapper { } {
>      global srcdir
> -    return "$srcdir/../config/i386/gcc-auto-profile -o perf.data "
> +    return "$srcdir/../config/i386/gcc-auto-profile -m8 "
>  }
>
>  # Return true if profiling is supported on the target.
> --
> 2.12.2
>
Jeff Law May 12, 2017, 5:51 p.m. UTC | #2
On 05/12/2017 03:38 AM, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> With high -j parallelism the autofdo tests can randomly fail.
> autofdo uses Linux perf to record profiling data.
> Linux perf uses a locked perf buffer. By default it has
> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
> 
> An individual perf record tries to grab the full 516k,
> which makes parallel perf record fail.
> 
> This patch limits the perf buffer for individual perf record to 8k.
> With the default settings this allows a parallelism of the test
> cases of 16, which is hopefully good enough
I'm routinely building & testing with -j56 these days.  We'll see if 
it's sufficient.

> 
> (if not would need to add some kind of semaphore, or ask
> the user to increase the limit as root)
> 
> I also removed an unneeded -o perf.data option
> 
> Thanks to Marcin to finally spotting the problem.
> 
> Passes bootstrap and test on x86_64-linux. Ok for trunk?
> 
> gcc/testsuite/:
> 
> 2017-05-12  Andi Kleen  <ak@linux.intel.com>
> 
> 	PR testsuite/77684
> 	* lib/target-supports.exp (profopt-perf-wrapper): Remove
> 	-p perf.data option. Add -m8 option.
OK.  But be aware that it may not ultimately be sufficient.

jeff
Aldy Hernandez May 18, 2017, 8:35 a.m. UTC | #3
Andi Kleen <andi@firstfloor.org> writes:

> From: Andi Kleen <ak@linux.intel.com>
>
> With high -j parallelism the autofdo tests can randomly fail.
> autofdo uses Linux perf to record profiling data.
> Linux perf uses a locked perf buffer. By default it has
> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
>
> An individual perf record tries to grab the full 516k,
> which makes parallel perf record fail.
>
> This patch limits the perf buffer for individual perf record to 8k.
> With the default settings this allows a parallelism of the test
> cases of 16, which is hopefully good enough

So for -jN > 16 it would silently fail again?

I think we should warn when the -jN is sufficiently large such that
tests will randomly fail, and perhaps suggest workarounds with
ulimit/etc.

Aldy
Richard Biener May 18, 2017, 8:39 a.m. UTC | #4
On Thu, May 18, 2017 at 10:35 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
> Andi Kleen <andi@firstfloor.org> writes:
>
>> From: Andi Kleen <ak@linux.intel.com>
>>
>> With high -j parallelism the autofdo tests can randomly fail.
>> autofdo uses Linux perf to record profiling data.
>> Linux perf uses a locked perf buffer. By default it has
>> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
>>
>> An individual perf record tries to grab the full 516k,
>> which makes parallel perf record fail.
>>
>> This patch limits the perf buffer for individual perf record to 8k.
>> With the default settings this allows a parallelism of the test
>> cases of 16, which is hopefully good enough
>
> So for -jN > 16 it would silently fail again?
>
> I think we should warn when the -jN is sufficiently large such that
> tests will randomly fail, and perhaps suggest workarounds with
> ulimit/etc.

Given that make check parallelism is somewhat "explicit", can't we
simply arrange for the chunks to never run more than, say, 4 tree-prof.exp
testcases at the same time?
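
The idea of capping concurrent profiling runs can be illustrated with a
counting semaphore. This is only a sketch: it uses threads in one Python
process as a stand-in, whereas the real testsuite runs separate processes
under make and would need a cross-process mechanism. The function name
and the sleep that stands in for a perf record run are hypothetical.

```python
import threading
import time

MAX_PROFILING_JOBS = 4  # the "say, 4" suggested above


def run_with_limit(n_jobs, limit=MAX_PROFILING_JOBS):
    """Run n_jobs dummy jobs; return the peak number running at once."""
    slots = threading.BoundedSemaphore(limit)
    state_lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def job():
        with slots:  # blocks while `limit` jobs already hold a slot
            with state_lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)  # stand-in for one perf record run
            with state_lock:
                state["active"] -= 1

    threads = [threading.Thread(target=job) for _ in range(n_jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]
```

Calling run_with_limit(16) starts 16 jobs but never lets more than 4 hold
a slot at once, whatever the requested parallelism.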

Richard.

> Aldy
Mike Stump May 18, 2017, 4:09 p.m. UTC | #5
On May 18, 2017, at 1:35 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
> 
> Andi Kleen <andi@firstfloor.org> writes:
> 
>> From: Andi Kleen <ak@linux.intel.com>
>> 
>> With high -j parallelism the autofdo tests can randomly fail.
>> autofdo uses Linux perf to record profiling data.
>> Linux perf uses a locked perf buffer. By default it has
>> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
>> 
>> An individual perf record tries to grab the full 516k,
>> which makes parallel perf record fail.
>> 
>> This patch limits the perf buffer for individual perf record to 8k.
>> With the default settings this allows a parallelism of the test
>> cases of 16, which is hopefully good enough
> 
> So for -jN > 16 it would silently fail again?
> 
> I think we should warn when the -jN is sufficiently large such that
> tests will randomly fail, and perhaps suggest workarounds with
> ulimit/etc.

Not a big fan of warning.  I'd rather smell the max and divide by n, or limit them to 4 (or 16, or any fixed constant) and pass appropriate arguments, or anything else that just fixes it.
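
One way to "divide by n" could look like the sketch below. It assumes
4 KiB pages, that the page count passed via -m must be a power of two (as
the perf record documentation states for --mmap-pages), and that the
limit is shared evenly among n concurrent jobs. The function names are
hypothetical and not part of the GCC testsuite.

```python
DEFAULT_LIMIT_KB = 516  # default cited in the patch description
MLOCK_SYSCTL = "/proc/sys/kernel/perf_event_mlock_kb"


def read_mlock_limit_kb(path=MLOCK_SYSCTL):
    """Return the mlock limit in KiB, or the default if it is unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return DEFAULT_LIMIT_KB


def pages_per_job(limit_kb, jobs, page_kb=4):
    """Largest power-of-two page count so that `jobs` buffers fit.

    Returns 0 when not even one page per job fits under the limit.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    share_pages = limit_kb // (jobs * page_kb)
    if share_pages < 1:
        return 0
    # Round down to a power of two.
    return 1 << (share_pages.bit_length() - 1)
```

With the default limit, pages_per_job(516, 16) gives the 8 pages used in
the patch, while pages_per_job(516, 56) drops to 2 pages per job.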
Aldy Hernandez May 18, 2017, 4:30 p.m. UTC | #6
On 05/18/2017 12:09 PM, Mike Stump wrote:
> On May 18, 2017, at 1:35 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
>>
>> Andi Kleen <andi@firstfloor.org> writes:
>>
>>> From: Andi Kleen <ak@linux.intel.com>
>>>
>>> With high -j parallelism the autofdo tests can randomly fail.
>>> autofdo uses Linux perf to record profiling data.
>>> Linux perf uses a locked perf buffer. By default it has
>>> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
>>>
>>> An individual perf record tries to grab the full 516k,
>>> which makes parallel perf record fail.
>>>
>>> This patch limits the perf buffer for individual perf record to 8k.
>>> With the default settings this allows a parallelism of the test
>>> cases of 16, which is hopefully good enough
>>
>> So for -jN > 16 it would silently fail again?
>>
>> I think we should warn when the -jN is sufficiently large such that
>> tests will randomly fail, and perhaps suggest workarounds with
>> ulimit/etc.
> 
> Not a big fan of warning.  I'd rather smell the max, and divide by n, or limit them to 4 (or 16 or any fix constant) and pass appropriate arguments or anything else that just fixes it.
> 

You'll hear no complaints from me :).
Aldy
Jeff Law May 20, 2017, 5:02 a.m. UTC | #7
On 05/18/2017 02:39 AM, Richard Biener wrote:
> On Thu, May 18, 2017 at 10:35 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
>> Andi Kleen <andi@firstfloor.org> writes:
>>
>>> From: Andi Kleen <ak@linux.intel.com>
>>>
>>> With high -j parallelism the autofdo tests can randomly fail.
>>> autofdo uses Linux perf to record profiling data.
>>> Linux perf uses a locked perf buffer. By default it has
>>> around 516k buffer per uid (/proc/sys/kernel/perf_event_mlock_kb).
>>>
>>> An individual perf record tries to grab the full 516k,
>>> which makes parallel perf record fail.
>>>
>>> This patch limits the perf buffer for individual perf record to 8k.
>>> With the default settings this allows a parallelism of the test
>>> cases of 16, which is hopefully good enough
>>
>> So for -jN > 16 it would silently fail again?
>>
>> I think we should warn when the -jN is sufficiently large such that
>> tests will randomly fail, and perhaps suggest workarounds with
>> ulimit/etc.
> 
> given that make check parallelism is somewhat "explicit" can't we
> simply arrange for the chunks to never get more than, say, 4 tree-prof.exp
> testcases run at the same time?
That would be significantly better than warning.

jeff

Patch

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 83e7f2670e6..a22657767db 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -522,9 +522,16 @@  proc check_effective_target_keeps_null_pointer_checks { } {
 
 # Return the autofdo profile wrapper
 
+# Linux by default allows 516KB of perf event buffers
+# in /proc/sys/kernel/perf_event_mlock_kb.
+# Each individual perf tries to grab all of it.
+# This causes problems with parallel test suite runs. Instead
+# limit us to 8 pages (32K), which should be good enough
+# for the small test programs. With the default settings
+# this allows up to 16 parallel gcc-auto-profile runs.
 proc profopt-perf-wrapper { } {
     global srcdir
-    return "$srcdir/../config/i386/gcc-auto-profile -o perf.data "
+    return "$srcdir/../config/i386/gcc-auto-profile -m8 "
 }
 
 # Return true if profiling is supported on the target.