diff mbox series

[bpf-next,1/5] bpf: block bpf_get_[stack|stackid] on perf_event with PEBS entries

Message ID 20200711012639.3429622-2-songliubraving@fb.com
State Changes Requested
Delegated to: BPF Maintainers
Series bpf: fix stackmap on perf_events with PEBS

Commit Message

Song Liu July 11, 2020, 1:26 a.m. UTC
Calling get_perf_callchain() on perf_events from PEBS entries may cause
unwinder errors. To fix this issue, the callchain is fetched early. Such
perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY.

Similarly, calling bpf_get_[stack|stackid] on perf_events from PEBS may
also cause unwinder errors. To fix this, block bpf_get_[stack|stackid] on
these perf_events. Unfortunately, the BPF verifier cannot tell whether a
program will be attached to a perf_event with PEBS entries. Therefore,
reject such programs at attach time, in ioctl(PERF_EVENT_IOC_SET_BPF).

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 include/linux/filter.h |  3 ++-
 kernel/bpf/verifier.c  |  3 +++
 kernel/events/core.c   | 10 ++++++++++
 3 files changed, 15 insertions(+), 1 deletion(-)

Comments

Andrii Nakryiko July 11, 2020, 3:53 a.m. UTC | #1
On Fri, Jul 10, 2020 at 6:30 PM Song Liu <songliubraving@fb.com> wrote:
>
> Calling get_perf_callchain() on perf_events from PEBS entries may cause
> unwinder errors. To fix this issue, the callchain is fetched early. Such
> perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY.
>
> Similarly, calling bpf_get_[stack|stackid] on perf_events from PEBS may
> also cause unwinder errors. To fix this, block bpf_get_[stack|stackid] on
> these perf_events. Unfortunately, bpf verifier cannot tell whether the
> program will be attached to perf_event with PEBS entries. Therefore,
> block such programs during ioctl(PERF_EVENT_IOC_SET_BPF).
>
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---

Perhaps it's a stupid question, but why can't bpf_get_stack/bpf_get_stackid
figure out automatically that they are called from a
__PERF_SAMPLE_CALLCHAIN_EARLY perf event and use a different callchain,
if necessary?

It is quite suboptimal from a user experience point of view to require
two different BPF helpers depending on PEBS or non-PEBS perf events.

[...]
Song Liu July 11, 2020, 6:28 a.m. UTC | #2
> On Jul 10, 2020, at 8:53 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Fri, Jul 10, 2020 at 6:30 PM Song Liu <songliubraving@fb.com> wrote:
>> 
>> Calling get_perf_callchain() on perf_events from PEBS entries may cause
>> unwinder errors. To fix this issue, the callchain is fetched early. Such
>> perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY.
>> 
>> Similarly, calling bpf_get_[stack|stackid] on perf_events from PEBS may
>> also cause unwinder errors. To fix this, block bpf_get_[stack|stackid] on
>> these perf_events. Unfortunately, bpf verifier cannot tell whether the
>> program will be attached to perf_event with PEBS entries. Therefore,
>> block such programs during ioctl(PERF_EVENT_IOC_SET_BPF).
>> 
>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> ---
> 
> Perhaps it's a stupid question, but why bpf_get_stack/bpf_get_stackid
> can't figure out automatically that they are called from
> __PERF_SAMPLE_CALLCHAIN_EARLY perf event and use different callchain,
> if necessary?
> 
> It is quite suboptimal from a user experience point of view to require
> two different BPF helpers depending on PEBS or non-PEBS perf events.

I am not aware of an easy way to tell the difference in bpf_get_stack. 
But I do agree that would be much better. 

Thanks,
Song
Andrii Nakryiko July 12, 2020, 5:06 a.m. UTC | #3
On Fri, Jul 10, 2020 at 11:28 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Jul 10, 2020, at 8:53 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > On Fri, Jul 10, 2020 at 6:30 PM Song Liu <songliubraving@fb.com> wrote:
> >>
> >> Calling get_perf_callchain() on perf_events from PEBS entries may cause
> >> unwinder errors. To fix this issue, the callchain is fetched early. Such
> >> perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY.
> >>
> >> Similarly, calling bpf_get_[stack|stackid] on perf_events from PEBS may
> >> also cause unwinder errors. To fix this, block bpf_get_[stack|stackid] on
> >> these perf_events. Unfortunately, bpf verifier cannot tell whether the
> >> program will be attached to perf_event with PEBS entries. Therefore,
> >> block such programs during ioctl(PERF_EVENT_IOC_SET_BPF).
> >>
> >> Signed-off-by: Song Liu <songliubraving@fb.com>
> >> ---
> >
> > Perhaps it's a stupid question, but why bpf_get_stack/bpf_get_stackid
> > can't figure out automatically that they are called from
> > __PERF_SAMPLE_CALLCHAIN_EARLY perf event and use different callchain,
> > if necessary?
> >
> > It is quite suboptimal from a user experience point of view to require
> > two different BPF helpers depending on PEBS or non-PEBS perf events.
>
> I am not aware of an easy way to tell the difference in bpf_get_stack.
> But I do agree that would be much better.
>

Hm... Looking a bit more at how all this is tied together in the kernel,
I think it's actually quite easy. So, for the perf_event BPF program type:

1. return a special prototype for bpf_get_stack/bpf_get_stackid, which
will have this extra bit of logic for callchain. All other program
types with access to bpf_get_stack/bpf_get_stackid should use the
current one, probably.
2. For that special program, just like for bpf_read_branch_records(),
we know that context is actually `struct bpf_perf_event_data_kern *`,
and it has pt_regs, perf_sample_data and perf_event itself.
3. With that, it seems like you'll have everything you need to
automatically choose a proper callchain.

All this absolutely transparently to the BPF program.

Am I missing something?
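
Steps 2 and 3 above can be modelled in user space as a rough sketch. Every
name below (mock_ctx, mock_sample_data, MOCK_SAMPLE_CALLCHAIN_EARLY,
pick_callchain, mock_unwind) is a hypothetical stand-in invented for
illustration, not the kernel's actual types or helpers. The only point is
the decision: reuse the callchain already captured with the sample when the
event carries the "early" flag, and unwind otherwise.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for __PERF_SAMPLE_CALLCHAIN_EARLY. */
#define MOCK_SAMPLE_CALLCHAIN_EARLY (1ULL << 63)

/* Hypothetical, simplified mirrors of the objects reachable from the ctx. */
struct mock_callchain {
	uint64_t nr;		/* number of valid entries in ip[] */
	uint64_t ip[4];
};

struct mock_event {
	uint64_t sample_type;
};

struct mock_sample_data {
	struct mock_callchain *callchain;	/* set when fetched early */
};

struct mock_ctx {
	struct mock_sample_data *data;
	struct mock_event *event;
};

/* Stand-in for a fresh unwind; a real helper would walk the stack here. */
static struct mock_callchain fresh_chain = { 2, { 0x10, 0x20 } };

static struct mock_callchain *mock_unwind(void)
{
	return &fresh_chain;
}

/*
 * Choose the callchain for a perf_event-specific helper variant:
 * if the event was marked "early" and a callchain was stored with the
 * sample, reuse it; otherwise fall back to unwinding.
 */
static struct mock_callchain *pick_callchain(const struct mock_ctx *ctx)
{
	int early = (ctx->event->sample_type & MOCK_SAMPLE_CALLCHAIN_EARLY) != 0;

	if (early && ctx->data->callchain != NULL)
		return ctx->data->callchain;

	return mock_unwind();
}
```

With this shape the BPF program itself never needs to know which path was
taken, which is the transparency argued for above.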

> Thanks,
> Song
Song Liu July 12, 2020, 6:34 a.m. UTC | #4
> On Jul 11, 2020, at 10:06 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Fri, Jul 10, 2020 at 11:28 PM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Jul 10, 2020, at 8:53 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>>> 
>>> On Fri, Jul 10, 2020 at 6:30 PM Song Liu <songliubraving@fb.com> wrote:
>>>> 
>>>> Calling get_perf_callchain() on perf_events from PEBS entries may cause
>>>> unwinder errors. To fix this issue, the callchain is fetched early. Such
>>>> perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY.
>>>> 
>>>> Similarly, calling bpf_get_[stack|stackid] on perf_events from PEBS may
>>>> also cause unwinder errors. To fix this, block bpf_get_[stack|stackid] on
>>>> these perf_events. Unfortunately, bpf verifier cannot tell whether the
>>>> program will be attached to perf_event with PEBS entries. Therefore,
>>>> block such programs during ioctl(PERF_EVENT_IOC_SET_BPF).
>>>> 
>>>> Signed-off-by: Song Liu <songliubraving@fb.com>
>>>> ---
>>> 
>>> Perhaps it's a stupid question, but why bpf_get_stack/bpf_get_stackid
>>> can't figure out automatically that they are called from
>>> __PERF_SAMPLE_CALLCHAIN_EARLY perf event and use different callchain,
>>> if necessary?
>>> 
>>> It is quite suboptimal from a user experience point of view to require
>>> two different BPF helpers depending on PEBS or non-PEBS perf events.
>> 
>> I am not aware of an easy way to tell the difference in bpf_get_stack.
>> But I do agree that would be much better.
>> 
> 
> Hm... Looking a bit more how all this is tied together in the kernel,
> I think it's actually quite easy. So, for perf_event BPF program type:
> 
> 1. return a special prototype for bpf_get_stack/bpf_get_stackid, which
> will have this extra bit of logic for callchain. All other program
> types with access to bpf_get_stack/bpf_get_stackid should use the
> current one, probably.
> 2. For that special program, just like for bpf_read_branch_records(),
> we know that context is actually `struct bpf_perf_event_data_kern *`,
> and it has pt_regs, perf_sample_data and perf_event itself.
> 3. With that, it seems like you'll have everything you need to
> automatically choose a proper callchain.
> 
> All this absolutely transparently to the BPF program.
> 
> Am I missing something?

Good idea! A separate prototype should work here. 
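
As a rough user-space model of the "separate prototype" idea (step 1 in the
quoted list), the per-program-type lookup could hand out a different
prototype only for perf_event programs. All identifiers here
(mock_prog_type, mock_func_id, mock_func_proto, mock_get_func_proto) are
made up for this sketch and are not kernel symbols.

```c
#include <stddef.h>

/* Hypothetical program types and helper ids, for illustration only. */
enum mock_prog_type { MOCK_PROG_KPROBE, MOCK_PROG_PERF_EVENT };
enum mock_func_id { MOCK_FUNC_get_stackid, MOCK_FUNC_get_stack, MOCK_FUNC_other };

/* Hypothetical prototype descriptor; a real one would also carry arg types. */
struct mock_func_proto {
	const char *name;
};

static const struct mock_func_proto generic_get_stackid = { "get_stackid" };
static const struct mock_func_proto generic_get_stack = { "get_stack" };
static const struct mock_func_proto pe_get_stackid = { "get_stackid_pe" };
static const struct mock_func_proto pe_get_stack = { "get_stack_pe" };

/*
 * Return the perf_event-aware prototype for perf_event programs and the
 * generic one for every other program type. Unknown helpers yield NULL.
 */
static const struct mock_func_proto *
mock_get_func_proto(enum mock_prog_type type, enum mock_func_id id)
{
	int is_pe = (type == MOCK_PROG_PERF_EVENT);

	switch (id) {
	case MOCK_FUNC_get_stackid:
		return is_pe ? &pe_get_stackid : &generic_get_stackid;
	case MOCK_FUNC_get_stack:
		return is_pe ? &pe_get_stack : &generic_get_stack;
	default:
		return NULL;
	}
}
```

Other program types keep the existing behaviour, so only perf_event
programs would see the callchain-aware variant.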

Thanks,
Song

Patch

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 2593777236037..fb34dc40f039b 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -534,7 +534,8 @@  struct bpf_prog {
 				is_func:1,	/* program is a bpf function */
 				kprobe_override:1, /* Do we override a kprobe? */
 				has_callchain_buf:1, /* callchain buffer allocated? */
-				enforce_expected_attach_type:1; /* Enforce expected_attach_type checking at attach time */
+				enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
+				call_get_perf_callchain:1; /* Do we call helpers that use get_perf_callchain()? */
 	enum bpf_prog_type	type;		/* Type of BPF program */
 	enum bpf_attach_type	expected_attach_type; /* For some prog types */
 	u32			len;		/* Number of filter blocks */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b608185e1ffd5..1e11b0f6fba31 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4884,6 +4884,9 @@  static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
 		env->prog->has_callchain_buf = true;
 	}
 
+	if (func_id == BPF_FUNC_get_stackid || func_id == BPF_FUNC_get_stack)
+		env->prog->call_get_perf_callchain = true;
+
 	if (changes_data)
 		clear_all_pkt_pointers(env);
 	return 0;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 856d98c36f562..f2f575a286bb4 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9544,6 +9544,16 @@  static int perf_event_set_bpf_handler(struct perf_event *event, u32 prog_fd)
 	if (IS_ERR(prog))
 		return PTR_ERR(prog);
 
+	if ((event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY) &&
+	    prog->call_get_perf_callchain) {
+		/*
+		 * The callchain for this perf_event is fetched early; the
+		 * attached BPF program must not call get_perf_callchain().
+		 */
+		bpf_prog_put(prog);
+		return -EINVAL;
+	}
+
 	event->prog = prog;
 	event->orig_overflow_handler = READ_ONCE(event->overflow_handler);
 	WRITE_ONCE(event->overflow_handler, bpf_overflow_handler);