diff mbox series

[bpf-next,v8,09/10] tools/bpf: add a test for bpf_get_stack with raw tracepoint prog

Message ID 20180428070205.1059628-10-yhs@fb.com
State Changes Requested, archived
Delegated to: BPF Maintainers
Series bpf: add bpf_get_stack helper

Commit Message

Yonghong Song April 28, 2018, 7:02 a.m. UTC
The test attaches a raw_tracepoint program to sched/sched_switch.
It retrieves the stack for user space, for kernel space, and for
user space with a build_id request. It also retrieves the user
and kernel stacks into the same buffer with back-to-back
bpf_get_stack helper calls.

Whenever the kernel stack is available, the user space
application checks that the kernel function for raw_tracepoint,
___bpf_prog_run, is part of the stack.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/testing/selftests/bpf/Makefile               |   4 +-
 tools/testing/selftests/bpf/test_get_stack_rawtp.c | 102 +++++++++++++++++
 tools/testing/selftests/bpf/test_progs.c           | 122 +++++++++++++++++++++
 3 files changed, 227 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/test_get_stack_rawtp.c

Comments

Alexei Starovoitov April 28, 2018, 4:56 p.m. UTC | #1
On Sat, Apr 28, 2018 at 12:02:04AM -0700, Yonghong Song wrote:
> The test attached a raw_tracepoint program to sched/sched_switch.
> It tested to get stack for user space, kernel space and user
> space with build_id request. It also tested to get user
> and kernel stack into the same buffer with back-to-back
> bpf_get_stack helper calls.
> 
> Whenever the kernel stack is available, the user space
> application will check to ensure that the kernel function
> for raw_tracepoint ___bpf_prog_run is part of the stack.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
...
> +static int get_stack_print_output(void *data, int size)
> +{
> +	bool good_kern_stack = false, good_user_stack = false;
> +	const char *expected_func = "___bpf_prog_run";

so the test works with interpreter only?
I guess that's ok for now, but needs to be fixed for
configs with CONFIG_BPF_JIT_ALWAYS_ON=y
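
One way a test could avoid depending on the interpreter is to check whether the JIT is enabled before deciding which kernel symbol to expect. The sketch below reads a sysctl-style file such as /proc/sys/net/core/bpf_jit_enable and treats any first character other than '0' as "enabled". The helper name jit_enabled_from() and its path parameter are hypothetical, chosen so the logic can be exercised against an ordinary file; this is not necessarily what a respin of the patch would do.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: report whether the sysctl-style file at 'path'
 * holds a value other than '0'. Returns false if the file cannot be
 * opened or read, so a caller falls back to the interpreter case.
 */
static bool jit_enabled_from(const char *path)
{
	bool enabled = false;
	char c;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return false;

	/* Only the first byte matters: '0' means the JIT is off. */
	if (read(fd, &c, sizeof(c)) == 1)
		enabled = (c != '0');

	close(fd);
	return enabled;
}
```

A caller would pass "/proc/sys/net/core/bpf_jit_enable" and only look for ___bpf_prog_run in the kernel stack when the function returns false.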
Y Song April 28, 2018, 6:17 p.m. UTC | #2
On Sat, Apr 28, 2018 at 9:56 AM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Sat, Apr 28, 2018 at 12:02:04AM -0700, Yonghong Song wrote:
>> The test attached a raw_tracepoint program to sched/sched_switch.
>> It tested to get stack for user space, kernel space and user
>> space with build_id request. It also tested to get user
>> and kernel stack into the same buffer with back-to-back
>> bpf_get_stack helper calls.
>>
>> Whenever the kernel stack is available, the user space
>> application will check to ensure that the kernel function
>> for raw_tracepoint ___bpf_prog_run is part of the stack.
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
> ...
>> +static int get_stack_print_output(void *data, int size)
>> +{
>> +     bool good_kern_stack = false, good_user_stack = false;
>> +     const char *expected_func = "___bpf_prog_run";
>
> so the test works with interpreter only?
> I guess that's ok for now, but needs to be fixed for
> configs with CONFIG_BPF_JIT_ALWAYS_ON=y

I did not test CONFIG_BPF_JIT_ALWAYS_ON=y.
I can send a follow-up patch for this if the patch set does not need a respin.
Alexei Starovoitov April 28, 2018, 7:06 p.m. UTC | #3
On Sat, Apr 28, 2018 at 11:17:30AM -0700, Y Song wrote:
> On Sat, Apr 28, 2018 at 9:56 AM, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> > On Sat, Apr 28, 2018 at 12:02:04AM -0700, Yonghong Song wrote:
> >> The test attached a raw_tracepoint program to sched/sched_switch.
> >> It tested to get stack for user space, kernel space and user
> >> space with build_id request. It also tested to get user
> >> and kernel stack into the same buffer with back-to-back
> >> bpf_get_stack helper calls.
> >>
> >> Whenever the kernel stack is available, the user space
> >> application will check to ensure that the kernel function
> >> for raw_tracepoint ___bpf_prog_run is part of the stack.
> >>
> >> Signed-off-by: Yonghong Song <yhs@fb.com>
> > ...
> >> +static int get_stack_print_output(void *data, int size)
> >> +{
> >> +     bool good_kern_stack = false, good_user_stack = false;
> >> +     const char *expected_func = "___bpf_prog_run";
> >
> > so the test works with interpreter only?
> > I guess that's ok for now, but needs to be fixed for
> > configs with CONFIG_BPF_JIT_ALWAYS_ON=y
> 
> I did not test CONFIG_BPF_JIT_ALWAYS_ON=y.
> I can have a followup patch for this if the patch set does not need respin.

I was thinking of applying the set and doing the fix in a follow-up,
but testing it with jit_enable=1 I don't see it failing,
so something is wrong with the test.
Also get_stack_raw_tp_action() keeps spawning new 'dd' in the background
which is not killed after test stops.
Please fix both issues in respin.
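
The leaked 'dd' comes from starting the workload with popen() and never stopping it. A minimal sketch of one alternative, assuming the test keeps the child's pid, is to fork the workload and then signal and reap it during cleanup. The helper names start_load() and stop_load() are hypothetical, and the child here simply sleeps in pause() as a stand-in for the real 'dd' command.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start a background child and return its pid,
 * or -1 if fork() fails. The child stands in for the 'dd' workload.
 */
static pid_t start_load(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		/* Child: block until a signal terminates the process. */
		for (;;)
			pause();
	}
	return pid;
}

/* Hypothetical helper: terminate the child and reap it so that no
 * process is left behind. Returns 0 only if the child was killed by
 * the SIGTERM sent here.
 */
static int stop_load(pid_t pid)
{
	int status;

	if (pid <= 0)
		return -1;
	if (kill(pid, SIGTERM) < 0)
		return -1;
	if (waitpid(pid, &status, 0) != pid)
		return -1;

	return (WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM) ? 0 : -1;
}
```

Calling stop_load() on every exit path of the test, including the error paths, keeps the workload from outliving the test run.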
Yonghong Song April 28, 2018, 8:02 p.m. UTC | #4
On 4/28/18 12:06 PM, Alexei Starovoitov wrote:
> On Sat, Apr 28, 2018 at 11:17:30AM -0700, Y Song wrote:
>> On Sat, Apr 28, 2018 at 9:56 AM, Alexei Starovoitov
>> <alexei.starovoitov@gmail.com> wrote:
>>> On Sat, Apr 28, 2018 at 12:02:04AM -0700, Yonghong Song wrote:
>>>> The test attached a raw_tracepoint program to sched/sched_switch.
>>>> It tested to get stack for user space, kernel space and user
>>>> space with build_id request. It also tested to get user
>>>> and kernel stack into the same buffer with back-to-back
>>>> bpf_get_stack helper calls.
>>>>
>>>> Whenever the kernel stack is available, the user space
>>>> application will check to ensure that the kernel function
>>>> for raw_tracepoint ___bpf_prog_run is part of the stack.
>>>>
>>>> Signed-off-by: Yonghong Song <yhs@fb.com>
>>> ...
>>>> +static int get_stack_print_output(void *data, int size)
>>>> +{
>>>> +     bool good_kern_stack = false, good_user_stack = false;
>>>> +     const char *expected_func = "___bpf_prog_run";
>>>
>>> so the test works with interpreter only?
>>> I guess that's ok for now, but needs to be fixed for
>>> configs with CONFIG_BPF_JIT_ALWAYS_ON=y
>>
>> I did not test CONFIG_BPF_JIT_ALWAYS_ON=y.
>> I can have a followup patch for this if the patch set does not need respin.
> 
> I was thinking to apply the set and do the fix in the follow up,
> but testing it with jit_enable=1 I don't see it's failing,
> so something is wrong with the test.

Yes, it is because the return value test

if (CHECK(err < 0, "perf_event_poller", "err %d errno %d\n", err,
...

the "err < 0" is not right as all the return values are nonnegative.
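
To illustrate why the check never fires, the sketch below compares the "err < 0" test with a stricter one. The numeric values of the PERF_EVENT_* codes are assumptions for illustration (the thread only says they are nonnegative), and it also assumes perf_event_poller passes the callback's return code through. The stricter predicate is just one possible shape of a fix, not necessarily the one used in the respin.

```c
#include <stdbool.h>

/* Assumed values, for illustration only; the thread states only that
 * all return values are nonnegative.
 */
enum {
	PERF_EVENT_DONE  = 0,
	PERF_EVENT_ERROR = 1,
	PERF_EVENT_CONT  = 2,
};

/* The condition used in the patch: true only for negative codes, so a
 * nonnegative PERF_EVENT_ERROR is never reported as a failure.
 */
static bool fails_with_lt_zero(int err)
{
	return err < 0;
}

/* One possible stricter condition: anything other than DONE fails. */
static bool fails_unless_done(int err)
{
	return err != PERF_EVENT_DONE;
}
```

With these assumed values, fails_with_lt_zero(PERF_EVENT_ERROR) is false while fails_unless_done(PERF_EVENT_ERROR) is true, which matches the observation that the test passed when it should have failed.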


> Also get_stack_raw_tp_action() keeps spawning new 'dd' in the background
> which is not killed after test stops.
> Please fix both issues in respin.

I will fix both and resend the patch.

Patch

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index b64a7a3..9d76218 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -32,7 +32,8 @@  TEST_GEN_FILES = test_pkt_access.o test_xdp.o test_l4lb.o test_tcp_estats.o test
 	test_l4lb_noinline.o test_xdp_noinline.o test_stacktrace_map.o \
 	sample_map_ret0.o test_tcpbpf_kern.o test_stacktrace_build_id.o \
 	sockmap_tcp_msg_prog.o connect4_prog.o connect6_prog.o test_adjust_tail.o \
-	test_btf_haskv.o test_btf_nokv.o test_sockmap_kern.o test_tunnel_kern.o
+	test_btf_haskv.o test_btf_nokv.o test_sockmap_kern.o test_tunnel_kern.o \
+	test_get_stack_rawtp.o
 
 # Order correspond to 'make run_tests' order
 TEST_PROGS := test_kmod.sh \
@@ -58,6 +59,7 @@  $(OUTPUT)/test_dev_cgroup: cgroup_helpers.c
 $(OUTPUT)/test_sock: cgroup_helpers.c
 $(OUTPUT)/test_sock_addr: cgroup_helpers.c
 $(OUTPUT)/test_sockmap: cgroup_helpers.c
+$(OUTPUT)/test_progs: trace_helpers.c
 
 .PHONY: force
 
diff --git a/tools/testing/selftests/bpf/test_get_stack_rawtp.c b/tools/testing/selftests/bpf/test_get_stack_rawtp.c
new file mode 100644
index 0000000..ba1dcf9
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_get_stack_rawtp.c
@@ -0,0 +1,102 @@ 
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include "bpf_helpers.h"
+
+/* Permit pretty deep stack traces */
+#define MAX_STACK_RAWTP 100
+struct stack_trace_t {
+	int pid;
+	int kern_stack_size;
+	int user_stack_size;
+	int user_stack_buildid_size;
+	__u64 kern_stack[MAX_STACK_RAWTP];
+	__u64 user_stack[MAX_STACK_RAWTP];
+	struct bpf_stack_build_id user_stack_buildid[MAX_STACK_RAWTP];
+};
+
+struct bpf_map_def SEC("maps") perfmap = {
+	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
+	.key_size = sizeof(int),
+	.value_size = sizeof(__u32),
+	.max_entries = 2,
+};
+
+struct bpf_map_def SEC("maps") stackdata_map = {
+	.type = BPF_MAP_TYPE_PERCPU_ARRAY,
+	.key_size = sizeof(__u32),
+	.value_size = sizeof(struct stack_trace_t),
+	.max_entries = 1,
+};
+
+/* Allocate per-cpu space twice the needed. For the code below
+ *   usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
+ *   if (usize < 0)
+ *     return 0;
+ *   ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
+ *
+ * If we have value_size = MAX_STACK_RAWTP * sizeof(__u64),
+ * verifier will complain that access "raw_data + usize"
+ * with size "max_len - usize" may be out of bound.
+ * The maximum "raw_data + usize" is "raw_data + max_len"
+ * and the maximum "max_len - usize" is "max_len", verifier
+ * concludes that the maximum buffer access range is
+ * "raw_data[0...max_len * 2 - 1]" and hence reject the program.
+ *
+ * Doubling the to-be-used max buffer size can fix this verifier
+ * issue and avoid complicated C programming massaging.
+ * This is an acceptable workaround since there is one entry here.
+ */
+struct bpf_map_def SEC("maps") rawdata_map = {
+	.type = BPF_MAP_TYPE_PERCPU_ARRAY,
+	.key_size = sizeof(__u32),
+	.value_size = MAX_STACK_RAWTP * sizeof(__u64) * 2,
+	.max_entries = 1,
+};
+
+SEC("tracepoint/sched/sched_switch")
+int bpf_prog1(void *ctx)
+{
+	int max_len, max_buildid_len, usize, ksize, total_size;
+	struct stack_trace_t *data;
+	void *raw_data;
+	__u32 key = 0;
+
+	data = bpf_map_lookup_elem(&stackdata_map, &key);
+	if (!data)
+		return 0;
+
+	max_len = MAX_STACK_RAWTP * sizeof(__u64);
+	max_buildid_len = MAX_STACK_RAWTP * sizeof(struct bpf_stack_build_id);
+	data->pid = bpf_get_current_pid_tgid();
+	data->kern_stack_size = bpf_get_stack(ctx, data->kern_stack,
+					      max_len, 0);
+	data->user_stack_size = bpf_get_stack(ctx, data->user_stack, max_len,
+					    BPF_F_USER_STACK);
+	data->user_stack_buildid_size = bpf_get_stack(
+		ctx, data->user_stack_buildid, max_buildid_len,
+		BPF_F_USER_STACK | BPF_F_USER_BUILD_ID);
+	bpf_perf_event_output(ctx, &perfmap, 0, data, sizeof(*data));
+
+	/* write both kernel and user stacks to the same buffer */
+	raw_data = bpf_map_lookup_elem(&rawdata_map, &key);
+	if (!raw_data)
+		return 0;
+
+	usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
+	if (usize < 0)
+		return 0;
+
+	ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
+	if (ksize < 0)
+		return 0;
+
+	total_size = usize + ksize;
+	if (total_size > 0 && total_size <= max_len)
+		bpf_perf_event_output(ctx, &perfmap, 0, raw_data, total_size);
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
+__u32 _version SEC("version") = 1; /* ignored by tracepoints, required by libbpf.a */
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index eedda98..c148a55 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -38,6 +38,7 @@  typedef __u16 __sum16;
 #include "bpf_util.h"
 #include "bpf_endian.h"
 #include "bpf_rlimit.h"
+#include "trace_helpers.h"
 
 static int error_cnt, pass_cnt;
 
@@ -1204,6 +1205,126 @@  static void test_stacktrace_build_id(void)
 	return;
 }
 
+#define MAX_CNT_RAWTP	10ull
+#define MAX_STACK_RAWTP	100
+struct get_stack_trace_t {
+	int pid;
+	int kern_stack_size;
+	int user_stack_size;
+	int user_stack_buildid_size;
+	__u64 kern_stack[MAX_STACK_RAWTP];
+	__u64 user_stack[MAX_STACK_RAWTP];
+	struct bpf_stack_build_id user_stack_buildid[MAX_STACK_RAWTP];
+};
+
+static void get_stack_raw_tp_action(void)
+{
+	FILE *f;
+
+	f = popen("taskset 1 dd if=/dev/zero of=/dev/null", "r");
+	(void) f;
+}
+
+static int get_stack_print_output(void *data, int size)
+{
+	bool good_kern_stack = false, good_user_stack = false;
+	const char *expected_func = "___bpf_prog_run";
+	struct get_stack_trace_t *e = data;
+	int i, num_stack;
+	static __u64 cnt;
+	struct ksym *ks;
+
+	cnt++;
+
+	if (size < sizeof(struct get_stack_trace_t)) {
+		__u64 *raw_data = data;
+
+		num_stack = size / sizeof(__u64);
+		for (i = 0; i < num_stack; i++) {
+			ks = ksym_search(raw_data[i]);
+			if (ks && (strcmp(ks->name, expected_func) == 0)) {
+				good_kern_stack = true;
+				good_user_stack = (i > 0);
+			}
+		}
+	} else {
+		if (e->kern_stack_size > 0) {
+			num_stack = e->kern_stack_size / sizeof(__u64);
+			for (i = 0; i < num_stack; i++) {
+				ks = ksym_search(e->kern_stack[i]);
+				if (ks && (strcmp(ks->name, expected_func) == 0))
+					good_kern_stack = true;
+			}
+		}
+		if (e->user_stack_size > 0 && e->user_stack_buildid_size > 0)
+			good_user_stack = true;
+	}
+	if (!good_kern_stack || !good_user_stack)
+		return PERF_EVENT_ERROR;
+
+	if (cnt == MAX_CNT_RAWTP)
+		return PERF_EVENT_DONE;
+
+	return PERF_EVENT_CONT;
+}
+
+static void test_get_stack_raw_tp(void)
+{
+	const char *file = "./test_get_stack_rawtp.o";
+	int efd, err, prog_fd, pmu_fd, perfmap_fd;
+	struct perf_event_attr attr = {};
+	__u32 key = 0, duration = 0;
+	struct bpf_object *obj;
+
+	err = bpf_prog_load(file, BPF_PROG_TYPE_RAW_TRACEPOINT, &obj, &prog_fd);
+	if (CHECK(err, "prog_load raw tp", "err %d errno %d\n", err, errno))
+		return;
+
+	efd = bpf_raw_tracepoint_open("sched_switch", prog_fd);
+	if (CHECK(efd < 0, "raw_tp_open", "err %d errno %d\n", efd, errno))
+		goto close_prog;
+
+	perfmap_fd = bpf_find_map(__func__, obj, "perfmap");
+	if (CHECK(perfmap_fd < 0, "bpf_find_map", "err %d errno %d\n",
+		  perfmap_fd, errno))
+		goto close_prog;
+
+	err = load_kallsyms();
+	if (CHECK(err < 0, "load_kallsyms", "err %d errno %d\n", err, errno))
+		goto close_prog;
+
+	attr.sample_type = PERF_SAMPLE_RAW;
+	attr.type = PERF_TYPE_SOFTWARE;
+	attr.config = PERF_COUNT_SW_BPF_OUTPUT;
+	pmu_fd = syscall(__NR_perf_event_open, &attr, -1/*pid*/, 0/*cpu*/,
+			 -1/*group_fd*/, 0);
+	if (CHECK(pmu_fd < 0, "perf_event_open", "err %d errno %d\n", pmu_fd,
+		  errno))
+		goto close_prog;
+
+	err = bpf_map_update_elem(perfmap_fd, &key, &pmu_fd, BPF_ANY);
+	if (CHECK(err < 0, "bpf_map_update_elem", "err %d errno %d\n", err,
+		  errno))
+		goto close_prog;
+
+	err = ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0);
+	if (CHECK(err < 0, "ioctl PERF_EVENT_IOC_ENABLE", "err %d errno %d\n",
+		  err, errno))
+		goto close_prog;
+
+	err = perf_event_poller(pmu_fd, get_stack_raw_tp_action,
+				get_stack_print_output);
+	if (CHECK(err < 0, "perf_event_poller", "err %d errno %d\n", err,
+		  errno))
+		goto close_prog;
+
+	goto close_prog_noerr;
+close_prog:
+	error_cnt++;
+close_prog_noerr:
+	bpf_object__close(obj);
+}
+
 int main(void)
 {
 	test_pkt_access();
@@ -1219,6 +1340,7 @@  int main(void)
 	test_stacktrace_map();
 	test_stacktrace_build_id();
 	test_stacktrace_map_raw_tp();
+	test_get_stack_raw_tp();
 
 	printf("Summary: %d PASSED, %d FAILED\n", pass_cnt, error_cnt);
 	return error_cnt ? EXIT_FAILURE : EXIT_SUCCESS;
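
The comment above rawdata_map in test_get_stack_rawtp.c explains why the value size is doubled. The arithmetic can be sketched as follows, under the simplifying assumption (taken from that comment) that the maximum offset and the maximum access size are bounded independently. The function names are hypothetical and this is a model of the reasoning, not verifier code.

```c
#include <stddef.h>

/* Largest end offset that must be assumed when the maximum of "usize"
 * and the maximum of "max_len - usize" are tracked independently.
 */
static size_t worst_case_end(size_t max_len)
{
	size_t max_offset = max_len;	/* usize may be as large as max_len */
	size_t max_access = max_len;	/* max_len - usize peaks at usize == 0 */

	return max_offset + max_access;
}

/* End offset actually reached at run time for a given usize. */
static size_t actual_end(size_t max_len, size_t usize)
{
	return usize + (max_len - usize);
}
```

With MAX_STACK_RAWTP = 100 and 8-byte entries, max_len is 800 bytes. The run-time end offset never exceeds 800, but the independent bounds add up to 1600, which is exactly the doubled value_size the patch allocates.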