[bpf-next,2/7] bpf: introduce bpf subcommand BPF_PERF_EVENT_QUERY

Message ID 20180515234521.856763-3-yhs@fb.com
State Changes Requested
Delegated to: BPF Maintainers
Series
  • bpf: implement BPF_PERF_EVENT_QUERY for perf event query

Commit Message

Yonghong Song May 15, 2018, 11:45 p.m.
Currently, suppose a userspace application has loaded a bpf program
and attached it to a tracepoint/kprobe/uprobe, and a bpf
introspection tool, e.g., bpftool, wants to show which bpf program
is attached to which tracepoint/kprobe/uprobe. Such attachment
information will be really useful to understand the overall bpf
deployment in the system.

There is a name field (16 bytes) for each program, which could
be used to encode the attachment point. This approach has some
drawbacks. First, a bpftool user (e.g., an admin) may not
really understand the association between the name and the
attachment point. Second, if one program is attached to multiple
places, encoding a proper name which can imply all these
attachments becomes difficult.

This patch introduces a new bpf subcommand BPF_PERF_EVENT_QUERY.
Given a pid and fd, if the <pid, fd> is associated with a
tracepoint/kprobe/uprobe perf event, BPF_PERF_EVENT_QUERY will return
   . prog_id
   . tracepoint name, or
   . k[ret]probe funcname + offset or kernel addr, or
   . u[ret]probe filename + offset
to the userspace.
The user can use "bpftool prog" to find more information about
bpf program itself with prog_id.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 include/linux/trace_events.h |  15 ++++++
 include/uapi/linux/bpf.h     |  25 ++++++++++
 kernel/bpf/syscall.c         | 113 +++++++++++++++++++++++++++++++++++++++++++
 kernel/trace/bpf_trace.c     |  53 ++++++++++++++++++++
 kernel/trace/trace_kprobe.c  |  29 +++++++++++
 kernel/trace/trace_uprobe.c  |  22 +++++++++
 6 files changed, 257 insertions(+)
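
For illustration, a userspace tool consuming the fields listed in the
commit message might render them roughly as follows. This is only a
minimal sketch: struct query_result, enum perf_info and format_result()
are hypothetical names that mirror the proposed output fields, and the
sketch deliberately does not issue the bpf() syscall itself.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of the BPF_PERF_INFO_* values proposed in the patch. */
enum perf_info {
	PERF_INFO_TP_NAME,
	PERF_INFO_KPROBE,
	PERF_INFO_KRETPROBE,
	PERF_INFO_UPROBE,
	PERF_INFO_URETPROBE,
};

/* Hypothetical userspace view of the output fields of the query. */
struct query_result {
	uint32_t prog_id;
	uint32_t prog_info;	/* one of enum perf_info */
	const char *buf;	/* tp name, kernel symbol or filename */
	uint64_t probe_offset;
	uint64_t probe_addr;
};

/* Render one result as a single line; returns what snprintf() returns. */
static int format_result(const struct query_result *r, char *out, size_t len)
{
	const char *name = r->buf ? r->buf : "";

	assert(out != NULL && len > 0);

	switch (r->prog_info) {
	case PERF_INFO_TP_NAME:
		return snprintf(out, len, "prog %" PRIu32 ": tracepoint %s",
				r->prog_id, name);
	case PERF_INFO_KPROBE:
	case PERF_INFO_KRETPROBE: {
		const char *kind = r->prog_info == PERF_INFO_KPROBE ?
				   "kprobe" : "kretprobe";

		/* symbol + offset if a symbol was returned, else raw address */
		if (strlen(name) > 0)
			return snprintf(out, len,
					"prog %" PRIu32 ": %s %s+0x%" PRIx64,
					r->prog_id, kind, name,
					r->probe_offset);
		return snprintf(out, len, "prog %" PRIu32 ": %s 0x%" PRIx64,
				r->prog_id, kind, r->probe_addr);
	}
	case PERF_INFO_UPROBE:
	case PERF_INFO_URETPROBE:
		return snprintf(out, len, "prog %" PRIu32 ": %s %s:0x%" PRIx64,
				r->prog_id,
				r->prog_info == PERF_INFO_UPROBE ?
				"uprobe" : "uretprobe",
				name, r->probe_offset);
	default:
		return snprintf(out, len,
				"prog %" PRIu32 ": unknown type %" PRIu32,
				r->prog_id, r->prog_info);
	}
}
```

With prog_id 42, type kprobe, symbol "do_sys_open" and offset 0x10, this
yields "prog 42: kprobe do_sys_open+0x10"; an empty symbol falls back to
printing probe_addr.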

Comments

Peter Zijlstra May 16, 2018, 11:27 a.m. | #1
On Tue, May 15, 2018 at 04:45:16PM -0700, Yonghong Song wrote:
> Currently, suppose a userspace application has loaded a bpf program
> and attached it to a tracepoint/kprobe/uprobe, and a bpf
> introspection tool, e.g., bpftool, wants to show which bpf program
> is attached to which tracepoint/kprobe/uprobe. Such attachment
> information will be really useful to understand the overall bpf
> deployment in the system.
> 
> There is a name field (16 bytes) for each program, which could
> be used to encode the attachment point. This approach has some
> drawbacks. First, a bpftool user (e.g., an admin) may not
> really understand the association between the name and the
> attachment point. Second, if one program is attached to multiple
> places, encoding a proper name which can imply all these
> attachments becomes difficult.
> 
> This patch introduces a new bpf subcommand BPF_PERF_EVENT_QUERY.
> Given a pid and fd, if the <pid, fd> is associated with a
> tracepoint/kprobe/uprobe perf event, BPF_PERF_EVENT_QUERY will return
>    . prog_id
>    . tracepoint name, or
>    . k[ret]probe funcname + offset or kernel addr, or
>    . u[ret]probe filename + offset
> to the userspace.
> The user can use "bpftool prog" to find more information about
> bpf program itself with prog_id.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>  include/linux/trace_events.h |  15 ++++++
>  include/uapi/linux/bpf.h     |  25 ++++++++++
>  kernel/bpf/syscall.c         | 113 +++++++++++++++++++++++++++++++++++++++++++
>  kernel/trace/bpf_trace.c     |  53 ++++++++++++++++++++
>  kernel/trace/trace_kprobe.c  |  29 +++++++++++
>  kernel/trace/trace_uprobe.c  |  22 +++++++++
>  6 files changed, 257 insertions(+)

Why is the command called *_PERF_EVENT_* ? Are there not a lot of !perf
places to attach BPF proglets?
Yonghong Song May 16, 2018, 9:59 p.m. | #2
On 5/16/18 4:27 AM, Peter Zijlstra wrote:
> On Tue, May 15, 2018 at 04:45:16PM -0700, Yonghong Song wrote:
>> Currently, suppose a userspace application has loaded a bpf program
>> and attached it to a tracepoint/kprobe/uprobe, and a bpf
>> introspection tool, e.g., bpftool, wants to show which bpf program
>> is attached to which tracepoint/kprobe/uprobe. Such attachment
>> information will be really useful to understand the overall bpf
>> deployment in the system.
>>
>> There is a name field (16 bytes) for each program, which could
>> be used to encode the attachment point. This approach has some
>> drawbacks. First, a bpftool user (e.g., an admin) may not
>> really understand the association between the name and the
>> attachment point. Second, if one program is attached to multiple
>> places, encoding a proper name which can imply all these
>> attachments becomes difficult.
>>
>> This patch introduces a new bpf subcommand BPF_PERF_EVENT_QUERY.
>> Given a pid and fd, if the <pid, fd> is associated with a
>> tracepoint/kprobe/uprobe perf event, BPF_PERF_EVENT_QUERY will return
>>     . prog_id
>>     . tracepoint name, or
>>     . k[ret]probe funcname + offset or kernel addr, or
>>     . u[ret]probe filename + offset
>> to the userspace.
>> The user can use "bpftool prog" to find more information about
>> bpf program itself with prog_id.
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>>   include/linux/trace_events.h |  15 ++++++
>>   include/uapi/linux/bpf.h     |  25 ++++++++++
>>   kernel/bpf/syscall.c         | 113 +++++++++++++++++++++++++++++++++++++++++++
>>   kernel/trace/bpf_trace.c     |  53 ++++++++++++++++++++
>>   kernel/trace/trace_kprobe.c  |  29 +++++++++++
>>   kernel/trace/trace_uprobe.c  |  22 +++++++++
>>   6 files changed, 257 insertions(+)
> 
> Why is the command called *_PERF_EVENT_* ? Are there not a lot of !perf
> places to attach BPF proglets?

Just to give a complete picture, below are the major places to attach
BPF programs:
    . perf based (through perf ioctl)
    . raw tracepoint based (through bpf interface)

    . netlink interface for tc, xdp, tunneling
    . setsockopt for socket filters
    . cgroup based (bpf attachment subcommand)
      mostly networking and io devices
    . some other networking socket related (sk_skb stream/parser/verdict,
      sk_msg verdict) through bpf attachment subcommand.

Currently, for cgroup based attachment, we have BPF_PROG_QUERY with 
input cgroup file descriptor. For other networking based queries, we
may need to enumerate tc filters, networking devices, open sockets, etc.
to get the attachment information.

So having a single BPF_QUERY command may be too complex to
cover all cases.

But you are right that BPF_PERF_EVENT_QUERY name is too narrow since
it should be used for other (pid, fd) based queries as well (e.g., 
socket, or other potential uses in the future).

How about the subcommand name BPF_TASK_FD_QUERY and make 
bpf_attr.task_fd_query extensible?

Thanks!
Daniel Borkmann May 17, 2018, 3:32 p.m. | #3
On 05/16/2018 11:59 PM, Yonghong Song wrote:
> On 5/16/18 4:27 AM, Peter Zijlstra wrote:
>> On Tue, May 15, 2018 at 04:45:16PM -0700, Yonghong Song wrote:
>>> Currently, suppose a userspace application has loaded a bpf program
>>> and attached it to a tracepoint/kprobe/uprobe, and a bpf
>>> introspection tool, e.g., bpftool, wants to show which bpf program
>>> is attached to which tracepoint/kprobe/uprobe. Such attachment
>>> information will be really useful to understand the overall bpf
>>> deployment in the system.
>>>
>>> There is a name field (16 bytes) for each program, which could
>>> be used to encode the attachment point. This approach has some
>>> drawbacks. First, a bpftool user (e.g., an admin) may not
>>> really understand the association between the name and the
>>> attachment point. Second, if one program is attached to multiple
>>> places, encoding a proper name which can imply all these
>>> attachments becomes difficult.
>>>
>>> This patch introduces a new bpf subcommand BPF_PERF_EVENT_QUERY.
>>> Given a pid and fd, if the <pid, fd> is associated with a
>>> tracepoint/kprobe/uprobe perf event, BPF_PERF_EVENT_QUERY will return
>>>     . prog_id
>>>     . tracepoint name, or
>>>     . k[ret]probe funcname + offset or kernel addr, or
>>>     . u[ret]probe filename + offset
>>> to the userspace.
>>> The user can use "bpftool prog" to find more information about
>>> bpf program itself with prog_id.
>>>
>>> Signed-off-by: Yonghong Song <yhs@fb.com>
>>> ---
>>>   include/linux/trace_events.h |  15 ++++++
>>>   include/uapi/linux/bpf.h     |  25 ++++++++++
>>>   kernel/bpf/syscall.c         | 113 +++++++++++++++++++++++++++++++++++++++++++
>>>   kernel/trace/bpf_trace.c     |  53 ++++++++++++++++++++
>>>   kernel/trace/trace_kprobe.c  |  29 +++++++++++
>>>   kernel/trace/trace_uprobe.c  |  22 +++++++++
>>>   6 files changed, 257 insertions(+)
>>
>> Why is the command called *_PERF_EVENT_* ? Are there not a lot of !perf
>> places to attach BPF proglets?
> 
> Just to give a complete picture, below are the major places to attach
> BPF programs:
>    . perf based (through perf ioctl)
>    . raw tracepoint based (through bpf interface)
> 
>    . netlink interface for tc, xdp, tunneling
>    . setsockopt for socket filters
>    . cgroup based (bpf attachment subcommand)
>      mostly networking and io devices
>    . some other networking socket related (sk_skb stream/parser/verdict,
>      sk_msg verdict) through bpf attachment subcommand.
> 
> Currently, for cgroup based attachment, we have BPF_PROG_QUERY with input cgroup file descriptor. For other networking based queries, we
> may need to enumerate tc filters, networking devices, open sockets, etc.
> to get the attachment information.
> 
> So having a single BPF_QUERY command may be too complex to
> cover all cases.
> 
> But you are right that BPF_PERF_EVENT_QUERY name is too narrow since
> it should be used for other (pid, fd) based queries as well (e.g., socket, or other potential uses in the future).
> 
> How about the subcommand name BPF_TASK_FD_QUERY and make bpf_attr.task_fd_query extensible?

I like the introspection output it provides in 7/7, it's really great!
So the query interface would only ever be tied to BPF progs whose attach
life time is tied to the life time of the application and as soon as all
refs on the fd are released it's unloaded from the system. BPF_TASK_FD_QUERY
seems okay to me, or something like BPF_ATTACH_QUERY. Even if the name is
slightly more generic, it might be more fitting with other cmds like
BPF_PROG_QUERY we have where we tell an attach point to retrieve all progs
from it (though only tied to cgroups right now, it may not be in future).

For all the others that are not strictly tied to the task but global, bpftool
would then need to be extended to query the various other interfaces like
netlink for retrieval which is on todo for some point in future as well. So
this set nicely complements this introspection aspect.

Thanks,
Daniel
Yonghong Song May 17, 2018, 5:50 p.m. | #4
On 5/17/18 8:32 AM, Daniel Borkmann wrote:
> On 05/16/2018 11:59 PM, Yonghong Song wrote:
>> On 5/16/18 4:27 AM, Peter Zijlstra wrote:
>>> On Tue, May 15, 2018 at 04:45:16PM -0700, Yonghong Song wrote:
>>>> Currently, suppose a userspace application has loaded a bpf program
>>>> and attached it to a tracepoint/kprobe/uprobe, and a bpf
>>>> introspection tool, e.g., bpftool, wants to show which bpf program
>>>> is attached to which tracepoint/kprobe/uprobe. Such attachment
>>>> information will be really useful to understand the overall bpf
>>>> deployment in the system.
>>>>
>>>> There is a name field (16 bytes) for each program, which could
>>>> be used to encode the attachment point. This approach has some
>>>> drawbacks. First, a bpftool user (e.g., an admin) may not
>>>> really understand the association between the name and the
>>>> attachment point. Second, if one program is attached to multiple
>>>> places, encoding a proper name which can imply all these
>>>> attachments becomes difficult.
>>>>
>>>> This patch introduces a new bpf subcommand BPF_PERF_EVENT_QUERY.
>>>> Given a pid and fd, if the <pid, fd> is associated with a
>>>> tracepoint/kprobe/uprobe perf event, BPF_PERF_EVENT_QUERY will return
>>>>      . prog_id
>>>>      . tracepoint name, or
>>>>      . k[ret]probe funcname + offset or kernel addr, or
>>>>      . u[ret]probe filename + offset
>>>> to the userspace.
>>>> The user can use "bpftool prog" to find more information about
>>>> bpf program itself with prog_id.
>>>>
>>>> Signed-off-by: Yonghong Song <yhs@fb.com>
>>>> ---
>>>>    include/linux/trace_events.h |  15 ++++++
>>>>    include/uapi/linux/bpf.h     |  25 ++++++++++
>>>>    kernel/bpf/syscall.c         | 113 +++++++++++++++++++++++++++++++++++++++++++
>>>>    kernel/trace/bpf_trace.c     |  53 ++++++++++++++++++++
>>>>    kernel/trace/trace_kprobe.c  |  29 +++++++++++
>>>>    kernel/trace/trace_uprobe.c  |  22 +++++++++
>>>>    6 files changed, 257 insertions(+)
>>>
>>> Why is the command called *_PERF_EVENT_* ? Are there not a lot of !perf
>>> places to attach BPF proglets?
>>
>> Just to give a complete picture, below are the major places to attach
>> BPF programs:
>>     . perf based (through perf ioctl)
>>     . raw tracepoint based (through bpf interface)
>>
>>     . netlink interface for tc, xdp, tunneling
>>     . setsockopt for socket filters
>>     . cgroup based (bpf attachment subcommand)
>>       mostly networking and io devices
>>     . some other networking socket related (sk_skb stream/parser/verdict,
>>       sk_msg verdict) through bpf attachment subcommand.
>>
>> Currently, for cgroup based attachment, we have BPF_PROG_QUERY with input cgroup file descriptor. For other networking based queries, we
>> may need to enumerate tc filters, networking devices, open sockets, etc.
>> to get the attachment information.
>>
>> So having a single BPF_QUERY command may be too complex to
>> cover all cases.
>>
>> But you are right that BPF_PERF_EVENT_QUERY name is too narrow since
>> it should be used for other (pid, fd) based queries as well (e.g., socket, or other potential uses in the future).
>>
>> How about the subcommand name BPF_TASK_FD_QUERY and make bpf_attr.task_fd_query extensible?
> 
> I like the introspection output it provides in 7/7, it's really great!
> So the query interface would only ever be tied to BPF progs whose attach
> life time is tied to the life time of the application and as soon as all
> refs on the fd are released it's unloaded from the system. BPF_TASK_FD_QUERY
> seems okay to me, or something like BPF_ATTACH_QUERY. Even if the name is
> slightly more generic, it might be more fitting with other cmds like
> BPF_PROG_QUERY we have where we tell an attach point to retrieve all progs
> from it (though only tied to cgroups right now, it may not be in future).

I think BPF_TASK_FD_QUERY is okay. BPF_ATTACH_QUERY indeed seems
a little too broad to me, since other query subcommands may be added
to query attachments with different inputs.

BPF_PROG_QUERY also queries attachments. Currently, given a
cgroup fd, it will query the attached prog array. Sean has a patch to
attach bpf programs to an RC device, and given a device fd, it will
query the prog array attached to that device.

> 
> For all the others that are not strictly tied to the task but global, bpftool
> would then need to be extended to query the various other interfaces like
> netlink for retrieval which is on todo for some point in future as well. So
> this set nicely complements this introspection aspect.

Totally agree.
Thanks!

> 
> Thanks,
> Daniel
>
kbuild test robot May 17, 2018, 11:52 p.m. | #5
Hi Yonghong,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Yonghong-Song/bpf-implement-BPF_PERF_EVENT_QUERY-for-perf-event-query/20180518-060508
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: i386-randconfig-x000-201819 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All warnings (new ones prefixed by >>):

   kernel/trace/trace_kprobe.c: In function 'bpf_get_kprobe_info':
>> kernel/trace/trace_kprobe.c:1315:17: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      *probe_addr = (u64)tk->rp.kp.addr;
                    ^

vim +1315 kernel/trace/trace_kprobe.c

  1290	
  1291	int bpf_get_kprobe_info(struct perf_event *event, u32 *prog_info,
  1292				const char **symbol, u64 *probe_offset,
  1293				u64 *probe_addr, bool perf_type_tracepoint)
  1294	{
  1295		const char *pevent = trace_event_name(event->tp_event);
  1296		const char *group = event->tp_event->class->system;
  1297		struct trace_kprobe *tk;
  1298	
  1299		if (perf_type_tracepoint)
  1300			tk = find_trace_kprobe(pevent, group);
  1301		else
  1302			tk = event->tp_event->data;
  1303		if (!tk)
  1304			return -EINVAL;
  1305	
  1306		*prog_info = trace_kprobe_is_return(tk) ? BPF_PERF_INFO_KRETPROBE
  1307							: BPF_PERF_INFO_KPROBE;
  1308		if (tk->symbol) {
  1309			*symbol = tk->symbol;
  1310			*probe_offset = tk->rp.kp.offset;
  1311			*probe_addr = 0;
  1312		} else {
  1313			*symbol = NULL;
  1314			*probe_offset = 0;
> 1315			*probe_addr = (u64)tk->rp.kp.addr;
  1316		}
  1317		return 0;
  1318	}
  1319	#endif	/* CONFIG_PERF_EVENTS */
  1320	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
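
The warning above fires because pointers are 32 bits wide on i386 while
the destination is 64 bits. A common way to silence it is to convert the
pointer to a pointer-sized integer first and widen afterwards; in kernel
code that would presumably read (u64)(unsigned long)tk->rp.kp.addr. The
userspace sketch below shows the same idea with uintptr_t; ptr_to_u64()
is a hypothetical helper name, not something from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The widening step is only lossless if pointers fit in 64 bits. */
static_assert(sizeof(uintptr_t) <= sizeof(uint64_t),
	      "pointers wider than 64 bits are not handled");

/*
 * Convert a pointer to a 64-bit integer without a direct
 * pointer-to-wider-integer cast: first to a pointer-sized integer,
 * then widen that integer to 64 bits.
 */
static uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}
```

The intermediate cast changes nothing on 64-bit builds and zero-extends
the address on 32-bit ones.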

Patch

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 2bde3ef..ec1f604 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -473,6 +473,9 @@  int perf_event_query_prog_array(struct perf_event *event, void __user *info);
 int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
 int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
 struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name);
+int bpf_get_perf_event_info(struct file *file, u32 *prog_id, u32 *prog_info,
+			    const char **buf, u64 *probe_offset,
+			    u64 *probe_addr);
 #else
 static inline unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
 {
@@ -504,6 +507,12 @@  static inline struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name
 {
 	return NULL;
 }
+static inline int bpf_get_perf_event_info(struct file *file, u32 *prog_id,
+					  u32 *prog_info, const char **buf,
+					  u64 *probe_offset, u64 *probe_addr)
+{
+	return -EOPNOTSUPP;
+}
 #endif
 
 enum {
@@ -560,10 +569,16 @@  extern void perf_trace_del(struct perf_event *event, int flags);
 #ifdef CONFIG_KPROBE_EVENTS
 extern int  perf_kprobe_init(struct perf_event *event, bool is_retprobe);
 extern void perf_kprobe_destroy(struct perf_event *event);
+extern int bpf_get_kprobe_info(struct perf_event *event, u32 *prog_info,
+			       const char **symbol, u64 *probe_offset,
+			       u64 *probe_addr, bool perf_type_tracepoint);
 #endif
 #ifdef CONFIG_UPROBE_EVENTS
 extern int  perf_uprobe_init(struct perf_event *event, bool is_retprobe);
 extern void perf_uprobe_destroy(struct perf_event *event);
+extern int bpf_get_uprobe_info(struct perf_event *event, u32 *prog_info,
+			       const char **filename, u64 *probe_offset,
+			       bool perf_type_tracepoint);
 #endif
 extern int  ftrace_profile_set_filter(struct perf_event *event, int event_id,
 				     char *filter_str);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d94d333..b78eca1 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -97,6 +97,7 @@  enum bpf_cmd {
 	BPF_RAW_TRACEPOINT_OPEN,
 	BPF_BTF_LOAD,
 	BPF_BTF_GET_FD_BY_ID,
+	BPF_PERF_EVENT_QUERY,
 };
 
 enum bpf_map_type {
@@ -379,6 +380,22 @@  union bpf_attr {
 		__u32		btf_log_size;
 		__u32		btf_log_level;
 	};
+
+	struct {
+		int		pid;		/* input: pid */
+		int		fd;		/* input: fd */
+		__u32		flags;		/* input: flags */
+		__u32		buf_len;	/* input: buf len */
+		__aligned_u64	buf;		/* input/output:
+						 *   tp_name for tracepoint
+						 *   symbol for kprobe
+						 *   filename for uprobe
+						 */
+		__u32		prog_id;	/* output: prog_id */
+		__u32		prog_info;	/* output: BPF_PERF_INFO_* */
+		__u64		probe_offset;	/* output: probe_offset */
+		__u64		probe_addr;	/* output: probe_addr */
+	} perf_event_query;
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF
@@ -2450,4 +2467,12 @@  struct bpf_fib_lookup {
 	__u8	dmac[6];     /* ETH_ALEN */
 };
 
+enum {
+	BPF_PERF_INFO_TP_NAME,		/* tp name */
+	BPF_PERF_INFO_KPROBE,		/* (symbol + offset) or addr */
+	BPF_PERF_INFO_KRETPROBE,	/* (symbol + offset) or addr */
+	BPF_PERF_INFO_UPROBE,		/* filename + offset */
+	BPF_PERF_INFO_URETPROBE,	/* filename + offset */
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e2aeb5e..347e4d2 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -18,7 +18,9 @@ 
 #include <linux/vmalloc.h>
 #include <linux/mmzone.h>
 #include <linux/anon_inodes.h>
+#include <linux/fdtable.h>
 #include <linux/file.h>
+#include <linux/fs.h>
 #include <linux/license.h>
 #include <linux/filter.h>
 #include <linux/version.h>
@@ -2093,6 +2095,114 @@  static int bpf_btf_get_fd_by_id(const union bpf_attr *attr)
 	return btf_get_fd_by_id(attr->btf_id);
 }
 
+static int bpf_perf_event_info_copy(const union bpf_attr *attr,
+				    union bpf_attr __user *uattr,
+				    u32 prog_id, u32 prog_info,
+				    const char *buf, u64 probe_offset,
+				    u64 probe_addr)
+{
+	__u64 __user *ubuf;
+	int len;
+
+	ubuf = u64_to_user_ptr(attr->perf_event_query.buf);
+	if (buf) {
+		len = strlen(buf);
+		if (attr->perf_event_query.buf_len < len + 1)
+			return -ENOSPC;
+		if (copy_to_user(ubuf, buf, len + 1))
+			return -EFAULT;
+	} else if (attr->perf_event_query.buf_len) {
+		/* copy '\0' to ubuf */
+		__u8 zero = 0;
+
+		if (copy_to_user(ubuf, &zero, 1))
+			return -EFAULT;
+	}
+
+	if (copy_to_user(&uattr->perf_event_query.prog_id, &prog_id,
+			 sizeof(prog_id)) ||
+	    copy_to_user(&uattr->perf_event_query.prog_info, &prog_info,
+			 sizeof(prog_info)) ||
+	    copy_to_user(&uattr->perf_event_query.probe_offset, &probe_offset,
+			 sizeof(probe_offset)) ||
+	    copy_to_user(&uattr->perf_event_query.probe_addr, &probe_addr,
+			 sizeof(probe_addr)))
+		return -EFAULT;
+
+	return 0;
+}
+
+#define BPF_PERF_EVENT_QUERY_LAST_FIELD perf_event_query.probe_addr
+
+static int bpf_perf_event_query(const union bpf_attr *attr,
+				union bpf_attr __user *uattr)
+{
+	pid_t pid = attr->perf_event_query.pid;
+	int fd = attr->perf_event_query.fd;
+	struct files_struct *files;
+	struct task_struct *task;
+	struct file *file;
+	int err;
+
+	if (CHECK_ATTR(BPF_PERF_EVENT_QUERY))
+		return -EINVAL;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	task = get_pid_task(find_vpid(pid), PIDTYPE_PID);
+	if (!task)
+		return -ENOENT;
+
+	files = get_files_struct(task);
+	put_task_struct(task);
+	if (!files)
+		return -ENOENT;
+
+	err = 0;
+	spin_lock(&files->file_lock);
+	file = fcheck_files(files, fd);
+	if (!file)
+		err = -ENOENT;
+	else
+		get_file(file);
+	spin_unlock(&files->file_lock);
+	put_files_struct(files);
+
+	if (err)
+		goto out;
+
+	if (file->f_op == &bpf_raw_tp_fops) {
+		struct bpf_raw_tracepoint *raw_tp = file->private_data;
+		struct bpf_raw_event_map *btp = raw_tp->btp;
+
+		if (!raw_tp->prog)
+			err = -ENOENT;
+		else
+			err = bpf_perf_event_info_copy(attr, uattr,
+						       raw_tp->prog->aux->id,
+						       BPF_PERF_INFO_TP_NAME,
+						       btp->tp->name, 0, 0);
+	} else {
+		u64 probe_offset = 0, probe_addr = 0;
+		u32 prog_id, prog_info;
+		const char *buf;
+
+		err = bpf_get_perf_event_info(file, &prog_id, &prog_info,
+					      &buf, &probe_offset,
+					      &probe_addr);
+		if (!err)
+			err = bpf_perf_event_info_copy(attr, uattr, prog_id,
+						       prog_info, buf,
+						       probe_offset,
+						       probe_addr);
+	}
+
+	fput(file);
+out:
+	return err;
+}
+
 SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
 {
 	union bpf_attr attr = {};
@@ -2179,6 +2289,9 @@  SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_BTF_GET_FD_BY_ID:
 		err = bpf_btf_get_fd_by_id(&attr);
 		break;
+	case BPF_PERF_EVENT_QUERY:
+		err = bpf_perf_event_query(&attr, uattr);
+		break;
 	default:
 		err = -EINVAL;
 		break;
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index ce2cbbf..7e8121e 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -14,6 +14,7 @@ 
 #include <linux/uaccess.h>
 #include <linux/ctype.h>
 #include <linux/kprobes.h>
+#include <linux/syscalls.h>
 #include <linux/error-injection.h>
 
 #include "trace_probe.h"
@@ -1163,3 +1164,55 @@  int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog)
 	mutex_unlock(&bpf_event_mutex);
 	return err;
 }
+
+int bpf_get_perf_event_info(struct file *file, u32 *prog_id, u32 *prog_info,
+			    const char **buf, u64 *probe_offset,
+			    u64 *probe_addr)
+{
+	bool is_tracepoint, is_syscall_tp;
+	struct perf_event *event;
+	struct bpf_prog *prog;
+	int flags, err = 0;
+
+	event = perf_get_event(file);
+	if (IS_ERR(event))
+		return PTR_ERR(event);
+
+	prog = event->prog;
+	if (!prog)
+		return -ENOENT;
+
+	/* not supporting BPF_PROG_TYPE_PERF_EVENT yet */
+	if (prog->type == BPF_PROG_TYPE_PERF_EVENT)
+		return -EOPNOTSUPP;
+
+	*prog_id = prog->aux->id;
+	flags = event->tp_event->flags;
+	is_tracepoint = flags & TRACE_EVENT_FL_TRACEPOINT;
+	is_syscall_tp = is_syscall_trace_event(event->tp_event);
+
+	if (is_tracepoint || is_syscall_tp) {
+		*buf = is_tracepoint ? event->tp_event->tp->name
+				     : event->tp_event->name;
+		*prog_info = BPF_PERF_INFO_TP_NAME;
+		*probe_offset = 0x0;
+		*probe_addr = 0x0;
+	} else {
+		/* kprobe/uprobe */
+		err = -EOPNOTSUPP;
+#ifdef CONFIG_KPROBE_EVENTS
+		if (flags & TRACE_EVENT_FL_KPROBE)
+			err = bpf_get_kprobe_info(event, prog_info, buf,
+						  probe_offset, probe_addr,
+						  event->attr.type == PERF_TYPE_TRACEPOINT);
+#endif
+#ifdef CONFIG_UPROBE_EVENTS
+		if (flags & TRACE_EVENT_FL_UPROBE)
+			err = bpf_get_uprobe_info(event, prog_info, buf,
+						  probe_offset,
+						  event->attr.type == PERF_TYPE_TRACEPOINT);
+#endif
+	}
+
+	return err;
+}
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 02aed76..595d154 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1287,6 +1287,35 @@  kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
 			      head, NULL);
 }
 NOKPROBE_SYMBOL(kretprobe_perf_func);
+
+int bpf_get_kprobe_info(struct perf_event *event, u32 *prog_info,
+			const char **symbol, u64 *probe_offset,
+			u64 *probe_addr, bool perf_type_tracepoint)
+{
+	const char *pevent = trace_event_name(event->tp_event);
+	const char *group = event->tp_event->class->system;
+	struct trace_kprobe *tk;
+
+	if (perf_type_tracepoint)
+		tk = find_trace_kprobe(pevent, group);
+	else
+		tk = event->tp_event->data;
+	if (!tk)
+		return -EINVAL;
+
+	*prog_info = trace_kprobe_is_return(tk) ? BPF_PERF_INFO_KRETPROBE
+						: BPF_PERF_INFO_KPROBE;
+	if (tk->symbol) {
+		*symbol = tk->symbol;
+		*probe_offset = tk->rp.kp.offset;
+		*probe_addr = 0;
+	} else {
+		*symbol = NULL;
+		*probe_offset = 0;
+		*probe_addr = (u64)tk->rp.kp.addr;
+	}
+	return 0;
+}
 #endif	/* CONFIG_PERF_EVENTS */
 
 /*
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index ac89287..e781a9f 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -1161,6 +1161,28 @@  static void uretprobe_perf_func(struct trace_uprobe *tu, unsigned long func,
 {
 	__uprobe_perf_func(tu, func, regs, ucb, dsize);
 }
+
+int bpf_get_uprobe_info(struct perf_event *event, u32 *prog_info,
+			const char **filename, u64 *probe_offset,
+			bool perf_type_tracepoint)
+{
+	const char *pevent = trace_event_name(event->tp_event);
+	const char *group = event->tp_event->class->system;
+	struct trace_uprobe *tu;
+
+	if (perf_type_tracepoint)
+		tu = find_probe_event(pevent, group);
+	else
+		tu = event->tp_event->data;
+	if (!tu)
+		return -EINVAL;
+
+	*prog_info = is_ret_probe(tu) ? BPF_PERF_INFO_URETPROBE
+				      : BPF_PERF_INFO_UPROBE;
+	*filename = tu->filename;
+	*probe_offset = tu->offset;
+	return 0;
+}
 #endif	/* CONFIG_PERF_EVENTS */
 
 static int