Message ID | 20200116174004.1522812-1-yhs@fb.com |
---|---|
State | Accepted |
Delegated to: | BPF Maintainers |
Series | [bpf-next] selftests/bpf: fix test_progs send_signal flakiness with nmi mode |
On Thu, Jan 16, 2020 at 10:06 AM Yonghong Song <yhs@fb.com> wrote:
>
> Alexei observed that test_progs send_signal may fail if run
> with command line "./test_progs" and the tests will pass
> if just run "./test_progs -n 40".
>
> I observed similar issue with nmi subtest failure
> and added a delay 100 us in Commit ab8b7f0cb358
> ("tools/bpf: Add self tests for bpf_send_signal_thread()")
> and the problem is gone for me. But the issue still exists
> in Alexei's testing environment.
>
> The current code uses sample_freq = 50 (50 events/second), which
> may not be enough. But if the sample_freq value is larger than
> sysctl kernel/perf_event_max_sample_rate, the perf_event_open
> syscall will fail.
>
> This patch changed nmi perf testing to use sample_period = 1,
> which means trying to sampling every event. This seems fixing
> the issue.
>
> Fixes: ab8b7f0cb358 ("tools/bpf: Add self tests for bpf_send_signal_thread()")
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---

Good not to have to rely on arbitrary timeout!

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  tools/testing/selftests/bpf/prog_tests/send_signal.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal.c b/tools/testing/selftests/bpf/prog_tests/send_signal.c
> index d4cedd86c424..504abb7bfb95 100644
> --- a/tools/testing/selftests/bpf/prog_tests/send_signal.c
> +++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c
> @@ -76,9 +76,6 @@ static void test_send_signal_common(struct perf_event_attr *attr,
>  	if (CHECK(!skel, "skel_open_and_load", "skeleton open_and_load failed\n"))
>  		goto skel_open_load_failure;
>
> -	/* add a delay for child thread to ramp up */
> -	usleep(100);
> -
>  	if (!attr) {
>  		err = test_send_signal_kern__attach(skel);
>  		if (CHECK(err, "skel_attach", "skeleton attach failed\n")) {
> @@ -155,8 +152,7 @@ static void test_send_signal_perf(bool signal_thread)
>  static void test_send_signal_nmi(bool signal_thread)
>  {
>  	struct perf_event_attr attr = {
> -		.sample_freq = 50,
> -		.freq = 1,
> +		.sample_period = 1,
>  		.type = PERF_TYPE_HARDWARE,
>  		.config = PERF_COUNT_HW_CPU_CYCLES,
>  	};
> --
> 2.17.1
>
On Thu, Jan 16, 2020 at 10:25 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> Good not to have to rely on arbitrary timeout!

Indeed. Applied. Thanks
diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal.c b/tools/testing/selftests/bpf/prog_tests/send_signal.c
index d4cedd86c424..504abb7bfb95 100644
--- a/tools/testing/selftests/bpf/prog_tests/send_signal.c
+++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c
@@ -76,9 +76,6 @@ static void test_send_signal_common(struct perf_event_attr *attr,
 	if (CHECK(!skel, "skel_open_and_load", "skeleton open_and_load failed\n"))
 		goto skel_open_load_failure;
 
-	/* add a delay for child thread to ramp up */
-	usleep(100);
-
 	if (!attr) {
 		err = test_send_signal_kern__attach(skel);
 		if (CHECK(err, "skel_attach", "skeleton attach failed\n")) {
@@ -155,8 +152,7 @@ static void test_send_signal_perf(bool signal_thread)
 static void test_send_signal_nmi(bool signal_thread)
 {
 	struct perf_event_attr attr = {
-		.sample_freq = 50,
-		.freq = 1,
+		.sample_period = 1,
 		.type = PERF_TYPE_HARDWARE,
 		.config = PERF_COUNT_HW_CPU_CYCLES,
 	};
Alexei observed that test_progs send_signal may fail if run
with command line "./test_progs", while the tests pass
if run alone with "./test_progs -n 40".

I observed a similar issue with nmi subtest failure
and added a 100 us delay in commit ab8b7f0cb358
("tools/bpf: Add self tests for bpf_send_signal_thread()"),
and the problem went away for me. But the issue still exists
in Alexei's testing environment.

The current code uses sample_freq = 50 (50 events/second), which
may not be enough. But if the sample_freq value is larger than
sysctl kernel/perf_event_max_sample_rate, the perf_event_open
syscall will fail.

This patch changes nmi perf testing to use sample_period = 1,
which means trying to sample every event. This seems to fix
the issue.

Fixes: ab8b7f0cb358 ("tools/bpf: Add self tests for bpf_send_signal_thread()")
Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/testing/selftests/bpf/prog_tests/send_signal.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
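
For readers comparing the two sampling modes discussed in the commit message, here is a minimal sketch, assuming a Linux system with <linux/perf_event.h> available. The helper names (nmi_attr_period, nmi_attr_freq, read_max_sample_rate, freq_exceeds_limit) are hypothetical and are not part of the selftest; they only illustrate how the attr differs before and after the patch, and how a frequency-mode request relates to the sysctl limit.

```c
#include <linux/perf_event.h>
#include <stdio.h>
#include <string.h>

/* Period mode, as after the patch: ask for a sample on every counted event. */
static struct perf_event_attr nmi_attr_period(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.sample_period = 1;	/* attr.freq stays 0, so this is a period */
	return attr;
}

/* Frequency mode, as before the patch: ask for hz samples per second. */
static struct perf_event_attr nmi_attr_freq(unsigned long long hz)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.freq = 1;		/* interpret the union field as a frequency */
	attr.sample_freq = hz;	/* shares storage with sample_period */
	return attr;
}

/* Hypothetical helper: read the sysctl limit, or return -1 if unreadable. */
static long read_max_sample_rate(void)
{
	FILE *f = fopen("/proc/sys/kernel/perf_event_max_sample_rate", "r");
	long rate = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &rate) != 1)
		rate = -1;
	fclose(f);
	return rate;
}

/*
 * Returns 1 when a frequency-mode attr asks for more samples per second
 * than max_rate, the case in which the commit message says
 * perf_event_open will fail. Period-mode attrs are never flagged.
 */
static int freq_exceeds_limit(const struct perf_event_attr *attr, long max_rate)
{
	if (!attr->freq || max_rate < 0)
		return 0;
	return attr->sample_freq > (unsigned long long)max_rate;
}
```

A caller could build the attr with nmi_attr_freq(50), pass the result of read_max_sample_rate() to freq_exceeds_limit(), and fall back to nmi_attr_period() when the check returns 1. The patch itself takes the simpler route of always using period mode, which sidesteps the sysctl limit entirely.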