[bpf-next] selftests/bpf: fix test_progs send_signal flakiness with nmi mode

Message ID 20200116174004.1522812-1-yhs@fb.com
State Accepted
Delegated to: BPF Maintainers

Commit Message

Yonghong Song Jan. 16, 2020, 5:40 p.m. UTC
Alexei observed that the test_progs send_signal test may fail when
run with the command line "./test_progs", while it passes when
run alone with "./test_progs -n 40".

I observed a similar failure in the nmi subtest and added
a 100 us delay in commit ab8b7f0cb358
("tools/bpf: Add self tests for bpf_send_signal_thread()"),
which made the problem go away for me. But the issue still exists
in Alexei's testing environment.

The current code uses sample_freq = 50 (50 events/second), which
may not be enough. But if the sample_freq value is larger than
sysctl kernel/perf_event_max_sample_rate, the perf_event_open
syscall will fail.

This patch changes the nmi perf test to use sample_period = 1,
which means trying to sample every event. This seems to fix
the issue.

Fixes: ab8b7f0cb358 ("tools/bpf: Add self tests for bpf_send_signal_thread()")
Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/testing/selftests/bpf/prog_tests/send_signal.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
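
The constraint described in the commit message (a sample_freq above the
kernel/perf_event_max_sample_rate sysctl makes perf_event_open fail) can be
checked from user space before opening the event. The following is only a
minimal sketch and is not part of the patch: the helper names
read_max_sample_rate() and sample_freq_allowed() are hypothetical, and the
sysctl is assumed to be exposed at /proc/sys/kernel/perf_event_max_sample_rate.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the patch): read a single integer
 * from a sysctl file. Returns the value, or -1 if the file cannot
 * be opened or does not start with a number.
 */
static long read_max_sample_rate(const char *path)
{
	FILE *f;
	long rate = -1;

	assert(path != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%ld", &rate) != 1)
		rate = -1;
	fclose(f);
	return rate;
}

/*
 * Hypothetical helper (not from the patch): returns 1 if freq does
 * not exceed max_rate, 0 if it does, and -1 if the limit is unknown
 * (max_rate < 0).
 */
static int sample_freq_allowed(long freq, long max_rate)
{
	if (max_rate < 0)
		return -1;
	return freq <= max_rate;
}
```

A caller would pass "/proc/sys/kernel/perf_event_max_sample_rate" to
read_max_sample_rate() and compare the result against the frequency it
intends to request. The patch avoids the question entirely by not using
frequency mode.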

Comments

Andrii Nakryiko Jan. 16, 2020, 6:24 p.m. UTC | #1
On Thu, Jan 16, 2020 at 10:06 AM Yonghong Song <yhs@fb.com> wrote:
>
> Alexei observed that test_progs send_signal may fail if run
> with command line "./test_progs" and the tests will pass
> if just run "./test_progs -n 40".
>
> I observed similar issue with nmi subtest failure
> and added a delay 100 us in Commit ab8b7f0cb358
> ("tools/bpf: Add self tests for bpf_send_signal_thread()")
> and the problem is gone for me. But the issue still exists
> in Alexei's testing environment.
>
> The current code uses sample_freq = 50 (50 events/second), which
> may not be enough. But if the sample_freq value is larger than
> sysctl kernel/perf_event_max_sample_rate, the perf_event_open
> syscall will fail.
>
> This patch changed nmi perf testing to use sample_period = 1,
> which means trying to sampling every event. This seems fixing
> the issue.
>
> Fixes: ab8b7f0cb358 ("tools/bpf: Add self tests for bpf_send_signal_thread()")
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---

Good not to have to rely on arbitrary timeout!

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  tools/testing/selftests/bpf/prog_tests/send_signal.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal.c b/tools/testing/selftests/bpf/prog_tests/send_signal.c
> index d4cedd86c424..504abb7bfb95 100644
> --- a/tools/testing/selftests/bpf/prog_tests/send_signal.c
> +++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c
> @@ -76,9 +76,6 @@ static void test_send_signal_common(struct perf_event_attr *attr,
>         if (CHECK(!skel, "skel_open_and_load", "skeleton open_and_load failed\n"))
>                 goto skel_open_load_failure;
>
> -       /* add a delay for child thread to ramp up */
> -       usleep(100);
> -
>         if (!attr) {
>                 err = test_send_signal_kern__attach(skel);
>                 if (CHECK(err, "skel_attach", "skeleton attach failed\n")) {
> @@ -155,8 +152,7 @@ static void test_send_signal_perf(bool signal_thread)
>  static void test_send_signal_nmi(bool signal_thread)
>  {
>         struct perf_event_attr attr = {
> -               .sample_freq = 50,
> -               .freq = 1,
> +               .sample_period = 1,
>                 .type = PERF_TYPE_HARDWARE,
>                 .config = PERF_COUNT_HW_CPU_CYCLES,
>         };
> --
> 2.17.1
>
Alexei Starovoitov Jan. 16, 2020, 9:31 p.m. UTC | #2
On Thu, Jan 16, 2020 at 10:25 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Jan 16, 2020 at 10:06 AM Yonghong Song <yhs@fb.com> wrote:
> >
> > Alexei observed that test_progs send_signal may fail if run
> > with command line "./test_progs" and the tests will pass
> > if just run "./test_progs -n 40".
> >
> > I observed similar issue with nmi subtest failure
> > and added a delay 100 us in Commit ab8b7f0cb358
> > ("tools/bpf: Add self tests for bpf_send_signal_thread()")
> > and the problem is gone for me. But the issue still exists
> > in Alexei's testing environment.
> >
> > The current code uses sample_freq = 50 (50 events/second), which
> > may not be enough. But if the sample_freq value is larger than
> > sysctl kernel/perf_event_max_sample_rate, the perf_event_open
> > syscall will fail.
> >
> > This patch changed nmi perf testing to use sample_period = 1,
> > which means trying to sampling every event. This seems fixing
> > the issue.
> >
> > Fixes: ab8b7f0cb358 ("tools/bpf: Add self tests for bpf_send_signal_thread()")
> > Signed-off-by: Yonghong Song <yhs@fb.com>
> > ---
>
> Good not to have to rely on arbitrary timeout!

Indeed.
Applied. Thanks

Patch

diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal.c b/tools/testing/selftests/bpf/prog_tests/send_signal.c
index d4cedd86c424..504abb7bfb95 100644
--- a/tools/testing/selftests/bpf/prog_tests/send_signal.c
+++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c
@@ -76,9 +76,6 @@  static void test_send_signal_common(struct perf_event_attr *attr,
 	if (CHECK(!skel, "skel_open_and_load", "skeleton open_and_load failed\n"))
 		goto skel_open_load_failure;
 
-	/* add a delay for child thread to ramp up */
-	usleep(100);
-
 	if (!attr) {
 		err = test_send_signal_kern__attach(skel);
 		if (CHECK(err, "skel_attach", "skeleton attach failed\n")) {
@@ -155,8 +152,7 @@  static void test_send_signal_perf(bool signal_thread)
 static void test_send_signal_nmi(bool signal_thread)
 {
 	struct perf_event_attr attr = {
-		.sample_freq = 50,
-		.freq = 1,
+		.sample_period = 1,
 		.type = PERF_TYPE_HARDWARE,
 		.config = PERF_COUNT_HW_CPU_CYCLES,
 	};
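
For comparison, the two sampling modes touched by the second hunk can be
written out side by side. This is only an illustrative sketch, not code from
the selftest: the helper names cycles_attr_freq() and cycles_attr_period()
are hypothetical, and setting attr.size is an addition that the test itself
does not make. In struct perf_event_attr, sample_freq and sample_period
share a union, and the freq bit selects which interpretation applies.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * Hypothetical helper: frequency mode, as used before the patch.
 * The kernel adjusts the period to aim for roughly hz samples per second.
 */
static struct perf_event_attr cycles_attr_freq(unsigned long long hz)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.freq = 1;		/* interpret the union as sample_freq */
	attr.sample_freq = hz;
	return attr;
}

/*
 * Hypothetical helper: period mode, as used after the patch.
 * A sample is requested every `period` occurrences of the event.
 */
static struct perf_event_attr cycles_attr_period(unsigned long long period)
{
	struct perf_event_attr attr;

	assert(period > 0);	/* a period of 0 would not request sampling */

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.freq = 0;		/* interpret the union as sample_period */
	attr.sample_period = period;
	return attr;
}
```

With these helpers, cycles_attr_freq(50) corresponds to the removed lines and
cycles_attr_period(1) to the added one. Period mode does not depend on
kernel/perf_event_max_sample_rate being at or above the requested frequency.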