[0/6] enable creating [k,u]probe with perf_event_open

Message ID 20171115172339.1791161-1-songliubraving@fb.com

Message

Song Liu Nov. 15, 2017, 5:23 p.m. UTC
Changes RFC v2 to PATCH v1:
  Check type PERF_TYPE_PROBE in perf_event_set_filter().
  Rebase on to tip perf/core.

Changes RFC v1 to RFC v2:
  Fix build issue reported by kbuild test bot by adding ifdef of
  CONFIG_KPROBE_EVENTS, and CONFIG_UPROBE_EVENTS.

RFC v1 cover letter:

This is to follow up the discussion over "new kprobe api" at Linux
Plumbers 2017:

https://www.linuxplumbersconf.org/2017/ocw/proposals/4808

With the current kernel, user space tools can only create/destroy
[k,u]probes with a text-based API (kprobe_events and uprobe_events in
tracefs). This approach relies on user space to clean up each [k,u]probe
after using it. However, it is not easy for user space to clean up properly.
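
To make the text-based flow concrete, here is a minimal sketch in C of how a
tool might build and write the kprobe_events commands. The helper names
(format_kprobe_add, format_kprobe_del, write_probe_cmd) and the group/event
names used with them are hypothetical, chosen only for illustration; the
tracefs mount point also varies between systems, so the path is left to the
caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build the command that defines a kprobe: "p:GROUP/EVENT SYMBOL".
 * Returns the snprintf() result (length of the formatted command). */
static int format_kprobe_add(char *buf, size_t len, const char *group,
                             const char *event, const char *symbol)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "p:%s/%s %s", group, event, symbol);
}

/* Build the command that removes the same kprobe: "-:GROUP/EVENT". */
static int format_kprobe_del(char *buf, size_t len, const char *group,
                             const char *event)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "-:%s/%s", group, event);
}

/* Append one command to a kprobe_events file (the caller supplies the
 * path under its tracefs mount). Returns 0 on success, -1 on error. */
static int write_probe_cmd(const char *path, const char *cmd)
{
    size_t want = strlen(cmd);
    int fd = open(path, O_WRONLY | O_APPEND);

    if (fd < 0)
        return -1;
    ssize_t n = write(fd, cmd, want);
    close(fd);
    return (n == (ssize_t)want) ? 0 : -1;
}
```

A tool using this sketch has to remember to write the "-:" command itself
before it exits. If it is killed between the add and the delete, the probe
stays registered in tracefs, which is the cleanup problem described above.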

To solve this problem, we introduce a file descriptor based API.
Specifically, we extended perf_event_open to create a [k,u]probe and attach
it to the file descriptor created by perf_event_open. These [k,u]probes are
associated with this file descriptor, so they are not available in tracefs.
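
As a rough illustration of the fd-based lifetime only, the sketch below opens
an event with perf_event_open and lets the caller release it with close().
It deliberately uses an existing event type (PERF_TYPE_SOFTWARE with
PERF_COUNT_SW_CPU_CLOCK) rather than PERF_TYPE_PROBE, because the probe
type and its perf_event_attr fields are defined by the patches in this
series and are not reproduced here. The helper names sys_perf_event_open
and open_cpu_clock_event are hypothetical.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc has no wrapper for perf_event_open, so call it via syscall(). */
static int sys_perf_event_open(struct perf_event_attr *attr, pid_t pid,
                               int cpu, int group_fd, unsigned long flags)
{
    return (int)syscall(__NR_perf_event_open, attr, pid, cpu, group_fd,
                        flags);
}

/* Open a software cpu-clock event for the calling process on any CPU.
 * Returns a file descriptor, or -1 with errno set (for example when
 * perf_event_paranoid or a sandbox forbids the call). */
static int open_cpu_clock_event(void)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_SOFTWARE;        /* stand-in for the new type */
    attr.config = PERF_COUNT_SW_CPU_CLOCK;
    attr.disabled = 1;                     /* created but not counting */
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;

    return sys_perf_event_open(&attr, 0 /* this process */,
                               -1 /* any cpu */, -1 /* no group */, 0);
}
```

The event exists only as long as the returned descriptor is open. Calling
close() on it, or exiting the process, releases the event, with no separate
cleanup command to issue.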

We reuse a large portion of the existing trace_kprobe and trace_uprobe code.
Currently, the file descriptor API does not support arguments as the
text-based API does. This should not be a problem, as users of the file
descriptor based API read data through other methods (bpf, etc.).

I also include a patch to bcc, and a patch to the perf_event_open man page.
Please see the list below. A fork of bcc with this patch is also available
on github:

  https://github.com/liu-song-6/bcc/tree/perf_event_opn

Thanks,
Song

man-pages patch:
  perf_event_open.2: add new type PERF_TYPE_PROBE

bcc patch:
  bcc: Try use new API to create [k,u]probe with perf_event_open

kernel patches:

Song Liu (6):
  perf: Add new type PERF_TYPE_PROBE
  perf: copy new perf_event.h to tools/include/uapi
  perf: implement kprobe support to PERF_TYPE_PROBE
  perf: implement uprobe support to PERF_TYPE_PROBE
  bpf: add option for bpf_load.c to use PERF_TYPE_PROBE
  bpf: add new test test_many_kprobe

 include/linux/trace_events.h          |   2 +
 include/uapi/linux/perf_event.h       |  35 ++++++-
 kernel/events/core.c                  |  41 +++++++-
 kernel/trace/trace_event_perf.c       | 127 +++++++++++++++++++++++
 kernel/trace/trace_kprobe.c           |  91 +++++++++++++++--
 kernel/trace/trace_probe.h            |  11 ++
 kernel/trace/trace_uprobe.c           |  90 +++++++++++++++--
 samples/bpf/Makefile                  |   3 +
 samples/bpf/bpf_load.c                |  61 ++++++-----
 samples/bpf/bpf_load.h                |  12 +++
 samples/bpf/test_many_kprobe_user.c   | 184 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/perf_event.h |  35 ++++++-
 12 files changed, 644 insertions(+), 48 deletions(-)
 create mode 100644 samples/bpf/test_many_kprobe_user.c

--
2.9.5

Comments

Alexei Starovoitov Nov. 22, 2017, 5 a.m. UTC | #1
On Wed, Nov 15, 2017 at 09:23:31AM -0800, Song Liu wrote:
> Changes RFC v2 to PATCH v1:
>   Check type PERF_TYPE_PROBE in perf_event_set_filter().
>   Rebase on to tip perf/core.
> 
> Changes RFC v1 to RFC v2:
>   Fix build issue reported by kbuild test bot by adding ifdef of
>   CONFIG_KPROBE_EVENTS, and CONFIG_UPROBE_EVENTS.
> 
> RFC v1 cover letter:
> 
> This is to follow up the discussion over "new kprobe api" at Linux
> Plumbers 2017:
> 
> With current kernel, user space tools can only create/destroy [k,u]probes
> with a text-based API (kprobe_events and uprobe_events in tracefs). This
> approach relies on user space to clean up the [k,u]probe after using them.
> However, this is not easy for user space to clean up properly.
> 
> To solve this problem, we introduce a file descriptor based API.
> Specifically, we extended perf_event_open to create [k,u]probe, and attach
> this [k,u]probe to the file descriptor created by perf_event_open. These
> [k,u]probe are associated with this file descriptor, so they are not
> available in tracefs.

Peter, Ingo,
could you please review the proposed perf_event_open api extension?

Christoph Hellwig Nov. 23, 2017, 9:02 a.m. UTC | #2
Just curious: why do you want to overload a multiplexer syscall even
more instead of adding explicit syscalls?

Peter Zijlstra Nov. 23, 2017, 9:49 a.m. UTC | #3
On Thu, Nov 23, 2017 at 01:02:00AM -0800, Christoph Hellwig wrote:
> Just curious: why do you want to overload a multiplexer syscall even
> more instead of adding explicit syscalls?

Mostly because perf already provides much of what they want: fd-based
lifetime and bpf integration.