From patchwork Wed Nov 15 17:23:33 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 838267 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=fb.com header.i=@fb.com header.b="G5qpW9SZ"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3ycWV805RBz9s7M for ; Thu, 16 Nov 2017 04:25:20 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758384AbdKORZQ (ORCPT ); Wed, 15 Nov 2017 12:25:16 -0500 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:45752 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758117AbdKORX6 (ORCPT ); Wed, 15 Nov 2017 12:23:58 -0500 Received: from pps.filterd (m0044008.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id vAFHNuvo026854 for ; Wed, 15 Nov 2017 09:23:57 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type; s=facebook; bh=HJN1JeuPQ+lCnNvIb01WmdGVEALPeqDQ2NJWbG4E3gE=; b=G5qpW9SZnQFWVGFyJbQ/afxi1BSd1SJBGE1vcvherV1cyzWvG4GmlaGCdEEBSU8N/tk8 hPxgN2c4x8zbGPhvjYMvImxeYUXtW+0SPYkcq6uX+Q6raIH4jMx/0E7b9gm51ugahJCz wurfsFQdlbNp0gpxezRDj8VeGV02BX55Xc0= Received: from mail.thefacebook.com ([199.201.64.23]) by mx0a-00082601.pphosted.com with ESMTP id 2e8rh6gb17-1 (version=TLSv1 cipher=ECDHE-RSA-AES256-SHA bits=256 verify=NOT) for ; Wed, 15 Nov 2017 09:23:57 -0800 Received: from 
mx-out.facebook.com (192.168.52.123) by PRN-CHUB10.TheFacebook.com (192.168.16.20) with Microsoft SMTP Server id 14.3.361.1; Wed, 15 Nov 2017 09:23:46 -0800 Received: by devbig102.frc2.facebook.com (Postfix, from userid 4523) id 60E9042824EB; Wed, 15 Nov 2017 09:23:46 -0800 (PST) Smtp-Origin-Hostprefix: devbig From: Song Liu Smtp-Origin-Hostname: devbig102.frc2.facebook.com To: , , , , , , CC: , Song Liu Smtp-Origin-Cluster: frc2c02 Subject: [PATCH 1/6] perf: Add new type PERF_TYPE_PROBE Date: Wed, 15 Nov 2017 09:23:33 -0800 Message-ID: <20171115172339.1791161-3-songliubraving@fb.com> X-Mailer: git-send-email 2.9.5 In-Reply-To: <20171115172339.1791161-1-songliubraving@fb.com> References: <20171115172339.1791161-1-songliubraving@fb.com> X-FB-Internal: Safe MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-11-15_09:, , signatures=0 X-Proofpoint-Spam-Reason: safe X-FB-Internal: Safe Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

A new perf type, PERF_TYPE_PROBE, is added to allow creating [k,u]probes with perf_event_open(). These [k,u]probes are associated with the file descriptor created by perf_event_open(), so they are easy to clean up when the file descriptor is destroyed.

struct probe_desc and two flags, is_uprobe and is_return, are added to describe the probe being created with perf_event_open().

Note: We use type __u64 for the probe_desc pointer instead of __aligned_u64. The reason is to avoid changing the size of struct perf_event_attr, which would break the new-kernel-old-utility scenario. To avoid alignment problems with the pointer, we will (in the following patches) copy probe_desc to an __aligned_u64 before using it as a pointer.
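For readers who want to see how the descriptor is meant to be filled, here is a minimal user-space sketch in C. It is not part of this patch: struct probe_desc_sketch and the helper names (ptr_to_u64, desc_kprobe_by_name, desc_kprobe_by_addr, desc_uprobe, load_u64_unaligned) are hypothetical, and plain uint64_t stands in for __aligned_u64/__u64. It only mirrors the union rules documented for struct probe_desc and the copy-before-use idea from the note on alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical user-space mirror of the proposed struct probe_desc.
 * uint64_t stands in for the kernel's __aligned_u64/__u64 types.
 */
struct probe_desc_sketch {
	union {
		uint64_t func;		/* kprobe: pointer to a function name */
		uint64_t path;		/* uprobe: pointer to a file path */
	};
	union {
		uint64_t addr;		/* kprobe with empty func: address */
		uint64_t offset;	/* uprobe, or kprobe with func set */
	};
};

/* Widen a pointer to 64 bits; also correct on 32-bit user space. */
static inline uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* kprobe placed at func + offset. */
static inline void desc_kprobe_by_name(struct probe_desc_sketch *pd,
				       const char *func, uint64_t offset)
{
	assert(pd != NULL);
	pd->func = ptr_to_u64(func);
	pd->offset = offset;
}

/* kprobe placed at a raw address; func must stay empty (0). */
static inline void desc_kprobe_by_addr(struct probe_desc_sketch *pd,
				       uint64_t addr)
{
	assert(pd != NULL);
	pd->func = 0;
	pd->addr = addr;
}

/* uprobe placed at offset inside the file at path. */
static inline void desc_uprobe(struct probe_desc_sketch *pd,
			       const char *path, uint64_t offset)
{
	assert(pd != NULL);
	pd->path = ptr_to_u64(path);
	pd->offset = offset;
}

/*
 * Read a 64-bit value that may not be 8-byte aligned by copying it
 * into a properly aligned local first, as the note above describes.
 */
static inline uint64_t load_u64_unaligned(const void *src)
{
	uint64_t v;

	memcpy(&v, src, sizeof(v));
	return v;
}
```

A caller would presumably then store ptr_to_u64(&pd) in attr.probe_desc, set attr.type to PERF_TYPE_PROBE together with is_uprobe/is_return, and call perf_event_open(). That step needs a kernel and headers with this series applied, so it is left out of the sketch.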
Signed-off-by: Song Liu Reviewed-by: Yonghong Song Reviewed-by: Josef Bacik Acked-by: Alexei Starovoitov --- include/uapi/linux/perf_event.h | 35 +++++++++++++++++++++++++++++++++-- 1 file changed, 33 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index 362493a..cc42d59 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -33,6 +33,7 @@ enum perf_type_id { PERF_TYPE_HW_CACHE = 3, PERF_TYPE_RAW = 4, PERF_TYPE_BREAKPOINT = 5, + PERF_TYPE_PROBE = 6, PERF_TYPE_MAX, /* non-ABI */ }; @@ -299,6 +300,29 @@ enum perf_event_read_format { #define PERF_ATTR_SIZE_VER4 104 /* add: sample_regs_intr */ #define PERF_ATTR_SIZE_VER5 112 /* add: aux_watermark */ +#define MAX_PROBE_FUNC_NAME_LEN 64 +/* + * Describe a kprobe or uprobe for PERF_TYPE_PROBE. + * perf_event_attr.probe_desc will point to this structure. is_uprobe + * and is_return are used to differentiate different types of probe + * (k/u, probe/retprobe). + * + * The two unions should be used as follows: + * For uprobe: use path and offset; + * For kprobe: if func is empty, use addr + * if func is not empty, use func and offset + */ +struct probe_desc { + union { + __aligned_u64 func; + __aligned_u64 path; + }; + union { + __aligned_u64 addr; + __u64 offset; + }; +}; + /* * Hardware event_id to monitor via a performance monitoring event: * @@ -320,7 +344,10 @@ struct perf_event_attr { /* * Type specific configuration information. 
*/ - __u64 config; + union { + __u64 config; + __u64 probe_desc; /* ptr to struct probe_desc */ + }; union { __u64 sample_period; @@ -370,7 +397,11 @@ struct perf_event_attr { context_switch : 1, /* context switch data */ write_backward : 1, /* Write ring buffer from end to beginning */ namespaces : 1, /* include namespaces data */ - __reserved_1 : 35; + + /* For PERF_TYPE_PROBE */ + is_uprobe : 1, /* 0: kprobe, 1: uprobe */ + is_return : 1, /* 0: probe, 1: retprobe */ + __reserved_1 : 33; union { __u32 wakeup_events; /* wakeup every n events */ From patchwork Wed Nov 15 17:23:35 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 838264 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=fb.com header.i=@fb.com header.b="SxL9Zvmx"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3ycWSX3GdRz9s7M for ; Thu, 16 Nov 2017 04:23:56 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758099AbdKORXy (ORCPT ); Wed, 15 Nov 2017 12:23:54 -0500 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:37374 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757489AbdKORXs (ORCPT ); Wed, 15 Nov 2017 12:23:48 -0500 Received: from pps.filterd (m0044010.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id vAFHJrEN008867 for ; Wed, 15 Nov 2017 09:23:48 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : 
subject : date : message-id : in-reply-to : references : mime-version : content-type; s=facebook; bh=v63xu/KRPzslKNNLOOi5Y0NW3ycpodKLN5nH25UcNVY=; b=SxL9ZvmxIF2zfyHj/pOlsHrUSINaXvgB5aHWS9UG/TcarE2gzYIK/OBz7CvAHPv6VdMl gGYYL91boaqfvJrz5CShjlCxogKDZo/QP9EBSrUKPPQf3WvqtQnXyOm7XKzaleWiHirZ WkFNtr9RfKOKsBqeLdekgcVGzdKg6p0uuko= Received: from mail.thefacebook.com ([199.201.64.23]) by mx0a-00082601.pphosted.com with ESMTP id 2e8qbu8mau-4 (version=TLSv1 cipher=ECDHE-RSA-AES256-SHA bits=256 verify=NOT) for ; Wed, 15 Nov 2017 09:23:48 -0800 Received: from mx-out.facebook.com (192.168.52.123) by PRN-CHUB13.TheFacebook.com (192.168.16.23) with Microsoft SMTP Server id 14.3.361.1; Wed, 15 Nov 2017 09:23:46 -0800 Received: by devbig102.frc2.facebook.com (Postfix, from userid 4523) id 9005942824EB; Wed, 15 Nov 2017 09:23:46 -0800 (PST) Smtp-Origin-Hostprefix: devbig From: Song Liu Smtp-Origin-Hostname: devbig102.frc2.facebook.com To: , , , , , , CC: , Song Liu Smtp-Origin-Cluster: frc2c02 Subject: [PATCH 2/6] perf: copy new perf_event.h to tools/include/uapi Date: Wed, 15 Nov 2017 09:23:35 -0800 Message-ID: <20171115172339.1791161-5-songliubraving@fb.com> X-Mailer: git-send-email 2.9.5 In-Reply-To: <20171115172339.1791161-1-songliubraving@fb.com> References: <20171115172339.1791161-1-songliubraving@fb.com> X-FB-Internal: Safe MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-11-15_09:, , signatures=0 X-Proofpoint-Spam-Reason: safe X-FB-Internal: Safe Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

perf_event.h was updated in the previous patch; this patch applies the same changes to the tools/ version. This part is kept in a separate patch in case the two files are backported separately.
Signed-off-by: Song Liu Reviewed-by: Yonghong Song Reviewed-by: Josef Bacik Acked-by: Alexei Starovoitov --- tools/include/uapi/linux/perf_event.h | 35 +++++++++++++++++++++++++++++++++-- 1 file changed, 33 insertions(+), 2 deletions(-) diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h index 362493a..cc42d59 100644 --- a/tools/include/uapi/linux/perf_event.h +++ b/tools/include/uapi/linux/perf_event.h @@ -33,6 +33,7 @@ enum perf_type_id { PERF_TYPE_HW_CACHE = 3, PERF_TYPE_RAW = 4, PERF_TYPE_BREAKPOINT = 5, + PERF_TYPE_PROBE = 6, PERF_TYPE_MAX, /* non-ABI */ }; @@ -299,6 +300,29 @@ enum perf_event_read_format { #define PERF_ATTR_SIZE_VER4 104 /* add: sample_regs_intr */ #define PERF_ATTR_SIZE_VER5 112 /* add: aux_watermark */ +#define MAX_PROBE_FUNC_NAME_LEN 64 +/* + * Describe a kprobe or uprobe for PERF_TYPE_PROBE. + * perf_event_attr.probe_desc will point to this structure. is_uprobe + * and is_return are used to differentiate different types of probe + * (k/u, probe/retprobe). + * + * The two unions should be used as follows: + * For uprobe: use path and offset; + * For kprobe: if func is empty, use addr + * if func is not empty, use func and offset + */ +struct probe_desc { + union { + __aligned_u64 func; + __aligned_u64 path; + }; + union { + __aligned_u64 addr; + __u64 offset; + }; +}; + /* * Hardware event_id to monitor via a performance monitoring event: * @@ -320,7 +344,10 @@ struct perf_event_attr { /* * Type specific configuration information. 
*/ - __u64 config; + union { + __u64 config; + __u64 probe_desc; /* ptr to struct probe_desc */ + }; union { __u64 sample_period; @@ -370,7 +397,11 @@ struct perf_event_attr { context_switch : 1, /* context switch data */ write_backward : 1, /* Write ring buffer from end to beginning */ namespaces : 1, /* include namespaces data */ - __reserved_1 : 35; + + /* For PERF_TYPE_PROBE */ + is_uprobe : 1, /* 0: kprobe, 1: uprobe */ + is_return : 1, /* 0: probe, 1: retprobe */ + __reserved_1 : 33; union { __u32 wakeup_events; /* wakeup every n events */ From patchwork Wed Nov 15 17:23:36 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 838270 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=fb.com header.i=@fb.com header.b="d6sTRLPv"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3ycWWS5jd2z9t2W for ; Thu, 16 Nov 2017 04:26:27 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758562AbdKORZg (ORCPT ); Wed, 15 Nov 2017 12:25:36 -0500 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:34427 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758078AbdKORXt (ORCPT ); Wed, 15 Nov 2017 12:23:49 -0500 Received: from pps.filterd (m0044012.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id vAFHKimW000519 for ; Wed, 15 Nov 2017 09:23:48 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : 
subject : date : message-id : in-reply-to : references : mime-version : content-type; s=facebook; bh=ICglcPiBgF92E2AJlZkwy7gAmXVmlXMM0oWccsdWDb4=; b=d6sTRLPvyD7e3k5ncPIbElm++BGzEv/7bTwRt3J5Y7xY2vWFuxMLTDFjPJLdVoDvPot+ VApmV2szv/proPI4zzQfLq5RKTYWsFv8cKOYbZ5CggzW1IxPkXeMux88NB5DF/joLKSN DJCxVDArqF8EqYv9gSScEYMT04ilHAM6mWg= Received: from mail.thefacebook.com ([199.201.64.23]) by mx0a-00082601.pphosted.com with ESMTP id 2e8rntg9ww-3 (version=TLSv1 cipher=ECDHE-RSA-AES256-SHA bits=256 verify=NOT) for ; Wed, 15 Nov 2017 09:23:48 -0800 Received: from mx-out.facebook.com (192.168.52.123) by PRN-CHUB05.TheFacebook.com (192.168.16.15) with Microsoft SMTP Server id 14.3.361.1; Wed, 15 Nov 2017 09:23:46 -0800 Received: by devbig102.frc2.facebook.com (Postfix, from userid 4523) id A68E542824EB; Wed, 15 Nov 2017 09:23:46 -0800 (PST) Smtp-Origin-Hostprefix: devbig From: Song Liu Smtp-Origin-Hostname: devbig102.frc2.facebook.com To: , , , , , , CC: , Song Liu Smtp-Origin-Cluster: frc2c02 Subject: [PATCH 3/6] perf: implement kprobe support to PERF_TYPE_PROBE Date: Wed, 15 Nov 2017 09:23:36 -0800 Message-ID: <20171115172339.1791161-6-songliubraving@fb.com> X-Mailer: git-send-email 2.9.5 In-Reply-To: <20171115172339.1791161-1-songliubraving@fb.com> References: <20171115172339.1791161-1-songliubraving@fb.com> X-FB-Internal: Safe MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-11-15_09:, , signatures=0 X-Proofpoint-Spam-Reason: safe X-FB-Internal: Safe Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org A new pmu, perf_probe, is created for PERF_TYPE_PROBE. Based on input from perf_event_open(), perf_probe creates a kprobe (or kretprobe) for the perf_event. This kprobe is private to this perf_event, and thus not added to global lists, and not available in tracefs. 
Two functions, create_local_trace_kprobe() and destroy_local_trace_kprobe(), are added to create and destroy these local trace_kprobes.

Signed-off-by: Song Liu Reviewed-by: Yonghong Song Reviewed-by: Josef Bacik --- include/linux/trace_events.h | 2 + kernel/events/core.c | 41 +++++++++++++++++-- kernel/trace/trace_event_perf.c | 81 ++++++++++++++++++++++++++++++++++++ kernel/trace/trace_kprobe.c | 91 +++++++++++++++++++++++++++++++++++++---- kernel/trace/trace_probe.h | 7 ++++ 5 files changed, 211 insertions(+), 11 deletions(-) diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h index 2bcb4dc..743e68d 100644 --- a/include/linux/trace_events.h +++ b/include/linux/trace_events.h @@ -494,6 +494,8 @@ extern int perf_trace_init(struct perf_event *event); extern void perf_trace_destroy(struct perf_event *event); extern int perf_trace_add(struct perf_event *event, int flags); extern void perf_trace_del(struct perf_event *event, int flags); +extern int perf_probe_init(struct perf_event *event); +extern void perf_probe_destroy(struct perf_event *event); extern int ftrace_profile_set_filter(struct perf_event *event, int event_id, char *filter_str); extern void ftrace_profile_free_filter(struct perf_event *event); diff --git a/kernel/events/core.c b/kernel/events/core.c index 81dd57b..95c6610 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7966,6 +7966,28 @@ static int perf_tp_event_init(struct perf_event *event) return 0; } +static int perf_probe_event_init(struct perf_event *event) +{ + int err; + + if (event->attr.type != PERF_TYPE_PROBE) + return -ENOENT; + + /* + * no branch sampling for probe events + */ + if (has_branch_stack(event)) + return -EOPNOTSUPP; + + err = perf_probe_init(event); + if (err) + return err; + + event->destroy = perf_probe_destroy; + + return 0; +} + static struct pmu perf_tracepoint = { .task_ctx_nr = perf_sw_context, @@ -7977,9 +7999,20 @@ static struct pmu perf_tracepoint = { .read = perf_swevent_read, }; 
+static struct pmu perf_probe = { + .task_ctx_nr = perf_sw_context, + .event_init = perf_probe_event_init, + .add = perf_trace_add, + .del = perf_trace_del, + .start = perf_swevent_start, + .stop = perf_swevent_stop, + .read = perf_swevent_read, +}; + static inline void perf_tp_register(void) { perf_pmu_register(&perf_tracepoint, "tracepoint", PERF_TYPE_TRACEPOINT); + perf_pmu_register(&perf_probe, "probe", PERF_TYPE_PROBE); } static void perf_event_free_filter(struct perf_event *event) @@ -8061,7 +8094,8 @@ static int perf_event_set_bpf_prog(struct perf_event *event, u32 prog_fd) bool is_kprobe, is_tracepoint, is_syscall_tp; struct bpf_prog *prog; - if (event->attr.type != PERF_TYPE_TRACEPOINT) + if (event->attr.type != PERF_TYPE_TRACEPOINT && + event->attr.type != PERF_TYPE_PROBE) return perf_event_set_bpf_handler(event, prog_fd); if (event->tp_event->prog) @@ -8533,8 +8567,9 @@ static int perf_event_set_filter(struct perf_event *event, void __user *arg) char *filter_str; int ret = -EINVAL; - if ((event->attr.type != PERF_TYPE_TRACEPOINT || - !IS_ENABLED(CONFIG_EVENT_TRACING)) && + if (((event->attr.type != PERF_TYPE_TRACEPOINT && + event->attr.type != PERF_TYPE_PROBE) || + !IS_ENABLED(CONFIG_EVENT_TRACING)) && !has_addr_filter(event)) return -EINVAL; diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c index 13ba2d3..bf9b99b 100644 --- a/kernel/trace/trace_event_perf.c +++ b/kernel/trace/trace_event_perf.c @@ -8,6 +8,7 @@ #include #include #include "trace.h" +#include "trace_probe.h" static char __percpu *perf_trace_buf[PERF_NR_CONTEXTS]; @@ -229,6 +230,74 @@ int perf_trace_init(struct perf_event *p_event) return ret; } +#ifdef CONFIG_KPROBE_EVENTS +static int perf_probe_create_kprobe(struct perf_event *p_event, + struct probe_desc *pd, char *name) +{ + struct trace_event_call *tp_event; + int ret; + + tp_event = create_local_trace_kprobe( + name, (void *)(unsigned long)(pd->addr), pd->offset, + p_event->attr.is_return); + if 
(IS_ERR(tp_event)) + return PTR_ERR(tp_event); + ret = perf_trace_event_init(tp_event, p_event); + if (ret) + destroy_local_trace_kprobe(tp_event); + + return ret; +} +#else +static int perf_probe_create_kprobe(struct perf_event *p_event, + struct probe_desc *pd, char *name) +{ + return -EOPNOTSUPP; +} +#endif /* CONFIG_KPROBE_EVENTS */ + +int perf_probe_init(struct perf_event *p_event) +{ + struct probe_desc pd; + int ret; + char *name = NULL; + __aligned_u64 aligned_probe_desc; + + /* + * attr.probe_desc may not be 64-bit aligned on 32-bit systems. + * Make an aligned copy of it to before u64_to_user_ptr(). + */ + memcpy(&aligned_probe_desc, &p_event->attr.probe_desc, + sizeof(__aligned_u64)); + + if (copy_from_user(&pd, u64_to_user_ptr(aligned_probe_desc), + sizeof(struct probe_desc))) + return -EFAULT; + + if (pd.func) { + name = kzalloc(MAX_PROBE_FUNC_NAME_LEN, GFP_KERNEL); + if (!name) + return -ENOMEM; + ret = strncpy_from_user(name, u64_to_user_ptr(pd.func), + MAX_PROBE_FUNC_NAME_LEN); + if (ret < 0) + goto out; + + if (name[0] == '\0') { + kfree(name); + name = NULL; + } + } + + if (!p_event->attr.is_uprobe) + ret = perf_probe_create_kprobe(p_event, &pd, name); + else + ret = -EOPNOTSUPP; +out: + kfree(name); + return ret; +} + void perf_trace_destroy(struct perf_event *p_event) { mutex_lock(&event_mutex); @@ -237,6 +306,18 @@ void perf_trace_destroy(struct perf_event *p_event) mutex_unlock(&event_mutex); } +void perf_probe_destroy(struct perf_event *p_event) +{ + perf_trace_event_close(p_event); + perf_trace_event_unreg(p_event); + + if (!p_event->attr.is_uprobe) { +#ifdef CONFIG_KPROBE_EVENTS + destroy_local_trace_kprobe(p_event->tp_event); +#endif + } +} + int perf_trace_add(struct perf_event *p_event, int flags) { struct trace_event_call *tp_event = p_event->tp_event; diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 8a907e1..16b334a 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -438,6 +438,14 
@@ disable_trace_kprobe(struct trace_kprobe *tk, struct trace_event_file *file) disable_kprobe(&tk->rp.kp); wait = 1; } + + /* + * if tk is not added to any list, it must be a local trace_kprobe + * created with perf_event_open. We don't need to wait for these + * trace_kprobes + */ + if (list_empty(&tk->list)) + wait = 0; out: if (wait) { /* @@ -1315,12 +1323,9 @@ static struct trace_event_functions kprobe_funcs = { .trace = print_kprobe_event }; -static int register_kprobe_event(struct trace_kprobe *tk) +static inline void init_trace_event_call(struct trace_kprobe *tk, + struct trace_event_call *call) { - struct trace_event_call *call = &tk->tp.call; - int ret; - - /* Initialize trace_event_call */ INIT_LIST_HEAD(&call->class->fields); if (trace_kprobe_is_return(tk)) { call->event.funcs = &kretprobe_funcs; @@ -1329,6 +1334,19 @@ static int register_kprobe_event(struct trace_kprobe *tk) call->event.funcs = &kprobe_funcs; call->class->define_fields = kprobe_event_define_fields; } + + call->flags = TRACE_EVENT_FL_KPROBE; + call->class->reg = kprobe_register; + call->data = tk; +} + +static int register_kprobe_event(struct trace_kprobe *tk) +{ + struct trace_event_call *call = &tk->tp.call; + int ret = 0; + + init_trace_event_call(tk, call); + if (set_print_fmt(&tk->tp, trace_kprobe_is_return(tk)) < 0) return -ENOMEM; ret = register_trace_event(&call->event); @@ -1336,9 +1354,6 @@ static int register_kprobe_event(struct trace_kprobe *tk) kfree(call->print_fmt); return -ENODEV; } - call->flags = TRACE_EVENT_FL_KPROBE; - call->class->reg = kprobe_register; - call->data = tk; ret = trace_add_event_call(call); if (ret) { pr_info("Failed to register kprobe event: %s\n", @@ -1360,6 +1375,66 @@ static int unregister_kprobe_event(struct trace_kprobe *tk) return ret; } +#ifdef CONFIG_PERF_EVENTS +/* create a trace_kprobe, but don't add it to global lists */ +struct trace_event_call * +create_local_trace_kprobe(char *func, void *addr, unsigned long offs, + bool is_return) +{ + 
struct trace_kprobe *tk; + int ret; + char *event; + + /* + * local trace_kprobes are not added to probe_list, so they are never + * searched in find_trace_kprobe(). Therefore, there is no concern of + * duplicated name here. + */ + event = func ? func : "DUMMY_EVENT"; + + tk = alloc_trace_kprobe(KPROBE_EVENT_SYSTEM, event, (void *)addr, func, + offs, 0 /* maxactive */, 0 /* nargs */, + is_return); + + if (IS_ERR(tk)) { + pr_info("Failed to allocate trace_probe.(%d)\n", + (int)PTR_ERR(tk)); + return ERR_CAST(tk); + } + + init_trace_event_call(tk, &tk->tp.call); + + if (set_print_fmt(&tk->tp, trace_kprobe_is_return(tk)) < 0) { + ret = -ENOMEM; + goto error; + } + + ret = __register_trace_kprobe(tk); + if (ret < 0) + goto error; + + return &tk->tp.call; +error: + free_trace_kprobe(tk); + return ERR_PTR(ret); +} + +void destroy_local_trace_kprobe(struct trace_event_call *event_call) +{ + struct trace_kprobe *tk; + + tk = container_of(event_call, struct trace_kprobe, tp.call); + + if (trace_probe_is_enabled(&tk->tp)) { + WARN_ON(1); + return; + } + + __unregister_trace_kprobe(tk); + free_trace_kprobe(tk); +} +#endif /* CONFIG_PERF_EVENTS */ + /* Make a tracefs interface for controlling probe points */ static __init int init_kprobe_trace(void) { diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h index 903273c..910ae1b 100644 --- a/kernel/trace/trace_probe.h +++ b/kernel/trace/trace_probe.h @@ -411,3 +411,10 @@ store_trace_args(int ent_size, struct trace_probe *tp, struct pt_regs *regs, } extern int set_print_fmt(struct trace_probe *tp, bool is_return); + +#ifdef CONFIG_PERF_EVENTS +extern struct trace_event_call * +create_local_trace_kprobe(char *func, void *addr, unsigned long offs, + bool is_return); +extern void destroy_local_trace_kprobe(struct trace_event_call *event_call); +#endif From patchwork Wed Nov 15 17:23:37 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu 
X-Patchwork-Id: 838268 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=fb.com header.i=@fb.com header.b="oj5VQMTO"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3ycWVV0l5vz9s7M for ; Thu, 16 Nov 2017 04:25:38 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758478AbdKORZW (ORCPT ); Wed, 15 Nov 2017 12:25:22 -0500 Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:33924 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1758095AbdKORXv (ORCPT ); Wed, 15 Nov 2017 12:23:51 -0500 Received: from pps.filterd (m0001255.ppops.net [127.0.0.1]) by mx0b-00082601.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id vAFHMPT2032582 for ; Wed, 15 Nov 2017 09:23:50 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type; s=facebook; bh=x849MKHBsWJAi4L5ZLOZ+yDh/HMv6Mocdz8QkBaZN14=; b=oj5VQMTOMWQgaIuZaeVv9l1g7RAsmOqq0M4XP3h0p6cqZhBtEBoQvSGBeEhjOO2sOzfy qJP3N4LQOUVzun+rLF0Th9bQLH/wZyOC3kTCY4AshWwSqnRV9MCAILSXrmt6JXdch/6q 9Q0GcckQexYo/jbaVLx4Zy+/RIq8q4pxSx0= Received: from mail.thefacebook.com ([199.201.64.23]) by mx0b-00082601.pphosted.com with ESMTP id 2e8kqh9b7g-5 (version=TLSv1 cipher=ECDHE-RSA-AES256-SHA bits=256 verify=NOT) for ; Wed, 15 Nov 2017 09:23:50 -0800 Received: from mx-out.facebook.com (192.168.52.123) by PRN-CHUB03.TheFacebook.com (192.168.16.13) with Microsoft SMTP Server id 14.3.361.1; Wed, 15 Nov 2017 09:23:46 -0800 Received: 
by devbig102.frc2.facebook.com (Postfix, from userid 4523) id BD21442824EB; Wed, 15 Nov 2017 09:23:46 -0800 (PST) Smtp-Origin-Hostprefix: devbig From: Song Liu Smtp-Origin-Hostname: devbig102.frc2.facebook.com To: , , , , , , CC: , Song Liu Smtp-Origin-Cluster: frc2c02 Subject: [PATCH 4/6] perf: implement uprobe support to PERF_TYPE_PROBE Date: Wed, 15 Nov 2017 09:23:37 -0800 Message-ID: <20171115172339.1791161-7-songliubraving@fb.com> X-Mailer: git-send-email 2.9.5 In-Reply-To: <20171115172339.1791161-1-songliubraving@fb.com> References: <20171115172339.1791161-1-songliubraving@fb.com> X-FB-Internal: Safe MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-11-15_09:, , signatures=0 X-Proofpoint-Spam-Reason: safe X-FB-Internal: Safe Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

This patch adds uprobe support to perf_probe, following the same pattern as the previous patch (for kprobe). Two functions, create_local_trace_uprobe() and destroy_local_trace_uprobe(), are added so that a uprobe can be created and attached to the file descriptor created by perf_event_open().
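As an illustration only, the checks that the uprobe creation path in this patch applies to its target (a name must be given, the path is resolved following symlinks, and the result must be a regular file) can be mimicked from user space before calling perf_event_open(). The helper name uprobe_target_ok() is hypothetical and not part of this series; it is a rough user-space analogue using stat(), not the kernel code.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Hypothetical user-space pre-check, loosely mirroring the validation
 * done on the kernel side for a local uprobe target.
 * Returns 0 if path names a regular file, a negative errno otherwise.
 */
static inline int uprobe_target_ok(const char *path)
{
	struct stat st;

	/* A missing or empty name is rejected with -EINVAL. */
	if (path == NULL || path[0] == '\0')
		return -EINVAL;

	/* stat() follows symlinks, similar in spirit to LOOKUP_FOLLOW. */
	if (stat(path, &st) != 0)
		return -errno;

	/* Only regular files are accepted as uprobe targets. */
	if (!S_ISREG(st.st_mode))
		return -EINVAL;

	return 0;
}
```

Running such a check first gives a clearer error in the tool than a bare failure from the syscall, but the kernel remains the authority: the file can change between the check and the perf_event_open() call.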
Signed-off-by: Song Liu Reviewed-by: Yonghong Song Reviewed-by: Josef Bacik --- kernel/trace/trace_event_perf.c | 48 +++++++++++++++++++++- kernel/trace/trace_probe.h | 4 ++ kernel/trace/trace_uprobe.c | 90 ++++++++++++++++++++++++++++++++++++----- 3 files changed, 131 insertions(+), 11 deletions(-) diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c index bf9b99b..4e4de84 100644 --- a/kernel/trace/trace_event_perf.c +++ b/kernel/trace/trace_event_perf.c @@ -256,6 +256,39 @@ static int perf_probe_create_kprobe(struct perf_event *p_event, } #endif /* CONFIG_KPROBE_EVENTS */ +#ifdef CONFIG_UPROBE_EVENTS +static int perf_probe_create_uprobe(struct perf_event *p_event, + struct probe_desc *pd, char *name) +{ + struct trace_event_call *tp_event; + int ret; + + if (!name) + return -EINVAL; + tp_event = create_local_trace_uprobe( + name, pd->offset, p_event->attr.is_return); + if (IS_ERR(tp_event)) + return PTR_ERR(tp_event); + /* + * local trace_uprobe need to hold event_mutex to call + * uprobe_buffer_enable() and uprobe_buffer_disable(). + * event_mutex is not required for local trace_kprobes. 
+ */ + mutex_lock(&event_mutex); + ret = perf_trace_event_init(tp_event, p_event); + if (ret) + destroy_local_trace_uprobe(tp_event); + mutex_unlock(&event_mutex); + return ret; +} +#else +static int perf_probe_create_uprobe(struct perf_event *p_event, + struct probe_desc *pd, char *name) +{ + return -EOPNOTSUPP; +} +#endif /* CONFIG_UPROBE_EVENTS */ + int perf_probe_init(struct perf_event *p_event) { struct probe_desc pd; @@ -292,7 +325,7 @@ int perf_probe_init(struct perf_event *p_event) if (!p_event->attr.is_uprobe) ret = perf_probe_create_kprobe(p_event, &pd, name); else - ret = -EOPNOTSUPP; + ret = perf_probe_create_uprobe(p_event, &pd, name); out: kfree(name); return ret; @@ -308,13 +341,26 @@ void perf_trace_destroy(struct perf_event *p_event) void perf_probe_destroy(struct perf_event *p_event) { + /* + * local trace_uprobe need to hold event_mutex to call + * uprobe_buffer_enable() and uprobe_buffer_disable(). + * event_mutex is not required for local trace_kprobes. + */ + if (p_event->attr.is_uprobe) + mutex_lock(&event_mutex); perf_trace_event_close(p_event); perf_trace_event_unreg(p_event); + if (p_event->attr.is_uprobe) + mutex_unlock(&event_mutex); if (!p_event->attr.is_uprobe) { #ifdef CONFIG_KPROBE_EVENTS destroy_local_trace_kprobe(p_event->tp_event); #endif + } else { +#ifdef CONFIG_UPROBE_EVENTS + destroy_local_trace_uprobe(p_event->tp_event); +#endif } } diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h index 910ae1b..86b5925 100644 --- a/kernel/trace/trace_probe.h +++ b/kernel/trace/trace_probe.h @@ -417,4 +417,8 @@ extern struct trace_event_call * create_local_trace_kprobe(char *func, void *addr, unsigned long offs, bool is_return); extern void destroy_local_trace_kprobe(struct trace_event_call *event_call); + +extern struct trace_event_call * +create_local_trace_uprobe(char *name, unsigned long offs, bool is_return); +extern void destroy_local_trace_uprobe(struct trace_event_call *event_call); #endif diff --git 
a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c index 4525e02..4d805d2 100644 --- a/kernel/trace/trace_uprobe.c +++ b/kernel/trace/trace_uprobe.c @@ -31,8 +31,8 @@ #define UPROBE_EVENT_SYSTEM "uprobes" struct uprobe_trace_entry_head { - struct trace_entry ent; - unsigned long vaddr[]; + struct trace_entry ent; + unsigned long vaddr[]; }; #define SIZEOF_TRACE_ENTRY(is_return) \ @@ -1293,16 +1293,25 @@ static struct trace_event_functions uprobe_funcs = { .trace = print_uprobe_event }; -static int register_uprobe_event(struct trace_uprobe *tu) +static inline void init_trace_event_call(struct trace_uprobe *tu, + struct trace_event_call *call) { - struct trace_event_call *call = &tu->tp.call; - int ret; - - /* Initialize trace_event_call */ INIT_LIST_HEAD(&call->class->fields); call->event.funcs = &uprobe_funcs; call->class->define_fields = uprobe_event_define_fields; + call->flags = TRACE_EVENT_FL_UPROBE; + call->class->reg = trace_uprobe_register; + call->data = tu; +} + +static int register_uprobe_event(struct trace_uprobe *tu) +{ + struct trace_event_call *call = &tu->tp.call; + int ret = 0; + + init_trace_event_call(tu, call); + if (set_print_fmt(&tu->tp, is_ret_probe(tu)) < 0) return -ENOMEM; @@ -1312,9 +1321,6 @@ static int register_uprobe_event(struct trace_uprobe *tu) return -ENODEV; } - call->flags = TRACE_EVENT_FL_UPROBE; - call->class->reg = trace_uprobe_register; - call->data = tu; ret = trace_add_event_call(call); if (ret) { @@ -1340,6 +1346,70 @@ static int unregister_uprobe_event(struct trace_uprobe *tu) return 0; } +#ifdef CONFIG_PERF_EVENTS +struct trace_event_call * +create_local_trace_uprobe(char *name, unsigned long offs, bool is_return) +{ + struct trace_uprobe *tu; + struct inode *inode; + struct path path; + int ret; + + ret = kern_path(name, LOOKUP_FOLLOW, &path); + if (ret) + return ERR_PTR(ret); + + inode = igrab(d_inode(path.dentry)); + path_put(&path); + + if (!inode || !S_ISREG(inode->i_mode)) { + iput(inode); + return 
ERR_PTR(-EINVAL); + } + + /* + * local trace_kprobes are not added to probe_list, so they are never + * searched in find_trace_kprobe(). Therefore, there is no concern of + * duplicated name "DUMMY_EVENT" here. + */ + tu = alloc_trace_uprobe(UPROBE_EVENT_SYSTEM, "DUMMY_EVENT", 0, + is_return); + + if (IS_ERR(tu)) { + pr_info("Failed to allocate trace_uprobe.(%d)\n", + (int)PTR_ERR(tu)); + return ERR_CAST(tu); + } + + tu->offset = offs; + tu->inode = inode; + tu->filename = kstrdup(name, GFP_KERNEL); + init_trace_event_call(tu, &tu->tp.call); + + if (set_print_fmt(&tu->tp, is_ret_probe(tu)) < 0) { + ret = -ENOMEM; + goto error; + } + + return &tu->tp.call; +error: + free_trace_uprobe(tu); + return ERR_PTR(ret); +} + +void destroy_local_trace_uprobe(struct trace_event_call *event_call) +{ + struct trace_uprobe *tu; + + tu = container_of(event_call, struct trace_uprobe, tp.call); + + kfree(tu->tp.call.print_fmt); + tu->tp.call.print_fmt = NULL; + + free_trace_uprobe(tu); +} +#endif /* CONFIG_PERF_EVENTS */ + /* Make a trace interface for controling probe points */ static __init int init_uprobe_trace(void) { From patchwork Wed Nov 15 17:23:38 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 838269 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=fb.com header.i=@fb.com header.b="NIoAxWLW"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3ycWVb3bsSz9sDB for ; Thu, 16 Nov 2017 04:25:43 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S1758542AbdKORZb (ORCPT ); Wed, 15 Nov 2017 12:25:31 -0500 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:57474 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758089AbdKORXu (ORCPT ); Wed, 15 Nov 2017 12:23:50 -0500 Received: from pps.filterd (m0109333.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id vAFHKEtm032476 for ; Wed, 15 Nov 2017 09:23:50 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type; s=facebook; bh=d/UDGf9LjNeCaRLfsXzjQ+T9fuOy5TknawCUA2LcM10=; b=NIoAxWLWflVJ1mpmc5itxfCo7RnQ+EGcW4WGpmMQc+rvTTY8Mvki8Pvbrc36+346g8RN EnqU5wldzyBmrir3xppp2XEkWXB/fa+vfeZIHcp2iFL43CzXm0q+oc+3BZ4wnZu65dN8 Mo8oCkH6qpK18LB1asO28Gfal0hOsvheHis= Received: from mail.thefacebook.com ([199.201.64.23]) by mx0a-00082601.pphosted.com with ESMTP id 2e8rsx890m-6 (version=TLSv1 cipher=ECDHE-RSA-AES256-SHA bits=256 verify=NOT) for ; Wed, 15 Nov 2017 09:23:50 -0800 Received: from mx-out.facebook.com (192.168.52.123) by PRN-CHUB08.TheFacebook.com (192.168.16.18) with Microsoft SMTP Server id 14.3.361.1; Wed, 15 Nov 2017 09:23:47 -0800 Received: by devbig102.frc2.facebook.com (Postfix, from userid 4523) id D3AC742824EB; Wed, 15 Nov 2017 09:23:46 -0800 (PST) Smtp-Origin-Hostprefix: devbig From: Song Liu Smtp-Origin-Hostname: devbig102.frc2.facebook.com To: , , , , , , CC: , Song Liu Smtp-Origin-Cluster: frc2c02 Subject: [PATCH 5/6] bpf: add option for bpf_load.c to use PERF_TYPE_PROBE Date: Wed, 15 Nov 2017 09:23:38 -0800 Message-ID: <20171115172339.1791161-8-songliubraving@fb.com> X-Mailer: git-send-email 2.9.5 In-Reply-To: <20171115172339.1791161-1-songliubraving@fb.com> References: <20171115172339.1791161-1-songliubraving@fb.com> X-FB-Internal: Safe MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , 
definitions=2017-11-15_09:, , signatures=0 X-Proofpoint-Spam-Reason: safe X-FB-Internal: Safe Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Function load_and_attach() is updated to be able to create kprobes with either old text based API, or the new PERF_TYPE_PROBE API. A global flag use_perf_type_probe is added to select between the two APIs. Signed-off-by: Song Liu Reviewed-by: Josef Bacik --- samples/bpf/bpf_load.c | 56 ++++++++++++++++++++++++++++++++------------------ samples/bpf/bpf_load.h | 8 ++++++++ 2 files changed, 44 insertions(+), 20 deletions(-) diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c index 2325d7a..dc6d843 100644 --- a/samples/bpf/bpf_load.c +++ b/samples/bpf/bpf_load.c @@ -8,7 +8,6 @@ #include #include #include -#include #include #include #include @@ -42,6 +41,7 @@ int prog_array_fd = -1; struct bpf_map_data map_data[MAX_MAPS]; int map_data_count = 0; +bool use_perf_type_probe = true; static int populate_prog_array(const char *event, int prog_fd) { @@ -70,8 +70,9 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size) size_t insns_cnt = size / sizeof(struct bpf_insn); enum bpf_prog_type prog_type; char buf[256]; - int fd, efd, err, id; + int fd, efd, err, id = -1; struct perf_event_attr attr = {}; + struct probe_desc pd; attr.type = PERF_TYPE_TRACEPOINT; attr.sample_type = PERF_SAMPLE_RAW; @@ -128,7 +129,7 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size) return populate_prog_array(event, fd); } - if (is_kprobe || is_kretprobe) { + if (!use_perf_type_probe && (is_kprobe || is_kretprobe)) { if (is_kprobe) event += 7; else @@ -169,27 +170,42 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size) strcat(buf, "/id"); } - efd = open(buf, O_RDONLY, 0); - if (efd < 0) { - printf("failed to open event %s\n", event); - return -1; - } - - err = read(efd, buf, sizeof(buf)); - if (err < 0 || err >= 
sizeof(buf)) { - printf("read from '%s' failed '%s'\n", event, strerror(errno)); - return -1; + if (use_perf_type_probe && (is_kprobe || is_kretprobe)) { + attr.type = PERF_TYPE_PROBE; + pd.func = ptr_to_u64(event + strlen(is_kprobe ? "kprobe/" + : "kretprobe/")); + pd.offset = 0; + attr.is_return = !!is_kretprobe; + attr.probe_desc = ptr_to_u64(&pd); + } else { + efd = open(buf, O_RDONLY, 0); + if (efd < 0) { + printf("failed to open event %s\n", event); + return -1; + } + err = read(efd, buf, sizeof(buf)); + if (err < 0 || err >= sizeof(buf)) { + printf("read from '%s' failed '%s'\n", event, + strerror(errno)); + return -1; + } + close(efd); + buf[err] = 0; + id = atoi(buf); + attr.config = id; } - close(efd); - - buf[err] = 0; - id = atoi(buf); - attr.config = id; - efd = sys_perf_event_open(&attr, -1/*pid*/, 0/*cpu*/, -1/*group_fd*/, 0); if (efd < 0) { - printf("event %d fd %d err %s\n", id, efd, strerror(errno)); + if (use_perf_type_probe && (is_kprobe || is_kretprobe)) + printf("k%sprobe %s fd %d err %s\n", + is_kprobe ? "" : "ret", + event + strlen(is_kprobe ? "kprobe/" + : "kretprobe/"), + efd, strerror(errno)); + else + printf("event %d fd %d err %s\n", id, efd, + strerror(errno)); return -1; } event_fd[prog_cnt - 1] = efd; diff --git a/samples/bpf/bpf_load.h b/samples/bpf/bpf_load.h index 7d57a42..e7a8a21 100644 --- a/samples/bpf/bpf_load.h +++ b/samples/bpf/bpf_load.h @@ -2,6 +2,7 @@ #ifndef __BPF_LOAD_H #define __BPF_LOAD_H +#include #include "libbpf.h" #define MAX_MAPS 32 @@ -38,6 +39,8 @@ extern int map_fd[MAX_MAPS]; extern struct bpf_map_data map_data[MAX_MAPS]; extern int map_data_count; +extern bool use_perf_type_probe; + /* parses elf file compiled by llvm .c->.o * . parses 'maps' section and creates maps via BPF syscall * . 
parses 'license' section and passes it to syscall @@ -59,6 +62,11 @@ struct ksym { char *name; }; +static inline __u64 ptr_to_u64(const void *ptr) +{ + return (__u64) (unsigned long) ptr; +} + int load_kallsyms(void); struct ksym *ksym_search(long key); int set_link_xdp_fd(int ifindex, int fd, __u32 flags); From patchwork Wed Nov 15 17:23:39 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 838271 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=fb.com header.i=@fb.com header.b="cRfMye8s"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3ycWXB6mQ6z9sDB for ; Thu, 16 Nov 2017 04:27:06 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758285AbdKORZA (ORCPT ); Wed, 15 Nov 2017 12:25:00 -0500 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:45856 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757965AbdKORYG (ORCPT ); Wed, 15 Nov 2017 12:24:06 -0500 Received: from pps.filterd (m0044008.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id vAFHNvLo026868 for ; Wed, 15 Nov 2017 09:24:05 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type; s=facebook; bh=ZmIifB7By6vBUjCujQpMGjtkfOXyXhM/tJtcLScYolY=; b=cRfMye8sfWfxGQ1IEn0G/tUVpuC2tuVryFjwch/R58KpSd2WXPoCMcouLLEPW9P9Z5tf 
lwwE9TUM/coFzeq/Ve7i9zd7ecqVv+quMU083Ev7lGl9bOccI3zVp0jvX3eo5EpRecSA wwEwoqSS3iHN+0rnTqDIxT3sIp6KMOYT8jA= Received: from mail.thefacebook.com ([199.201.64.23]) by mx0a-00082601.pphosted.com with ESMTP id 2e8rh6gb0x-12 (version=TLSv1 cipher=ECDHE-RSA-AES256-SHA bits=256 verify=NOT) for ; Wed, 15 Nov 2017 09:24:05 -0800 Received: from mx-out.facebook.com (192.168.52.123) by PRN-CHUB06.TheFacebook.com (192.168.16.16) with Microsoft SMTP Server id 14.3.361.1; Wed, 15 Nov 2017 09:23:47 -0800 Received: by devbig102.frc2.facebook.com (Postfix, from userid 4523) id EA3DA42824EB; Wed, 15 Nov 2017 09:23:46 -0800 (PST) Smtp-Origin-Hostprefix: devbig From: Song Liu Smtp-Origin-Hostname: devbig102.frc2.facebook.com To: , , , , , , CC: , Song Liu Smtp-Origin-Cluster: frc2c02 Subject: [PATCH 6/6] bpf: add new test test_many_kprobe Date: Wed, 15 Nov 2017 09:23:39 -0800 Message-ID: <20171115172339.1791161-9-songliubraving@fb.com> X-Mailer: git-send-email 2.9.5 In-Reply-To: <20171115172339.1791161-1-songliubraving@fb.com> References: <20171115172339.1791161-1-songliubraving@fb.com> X-FB-Internal: Safe MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-11-15_09:, , signatures=0 X-Proofpoint-Spam-Reason: safe X-FB-Internal: Safe Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The test compares old text based kprobe API with PERF_TYPE_PROBE. 
Here is a sample output of this test: Creating 1000 kprobes with text-based API takes 6.979683 seconds Cleaning 1000 kprobes with text-based API takes 84.897687 seconds Creating 1000 kprobes with PERF_TYPE_PROBE (function name) takes 5.077558 seconds Cleaning 1000 kprobes with PERF_TYPE_PROBE (function name) takes 81.241354 seconds Creating 1000 kprobes with PERF_TYPE_PROBE (function addr) takes 5.218255 seconds Cleaning 1000 kprobes with PERF_TYPE_PROBE (function addr) takes 80.010731 seconds Signed-off-by: Song Liu Reviewed-by: Josef Bacik --- samples/bpf/Makefile | 3 + samples/bpf/bpf_load.c | 5 +- samples/bpf/bpf_load.h | 4 + samples/bpf/test_many_kprobe_user.c | 184 ++++++++++++++++++++++++++++++++++++ 4 files changed, 193 insertions(+), 3 deletions(-) create mode 100644 samples/bpf/test_many_kprobe_user.c diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile index 9b4a66e..ec92f35 100644 --- a/samples/bpf/Makefile +++ b/samples/bpf/Makefile @@ -42,6 +42,7 @@ hostprogs-y += xdp_redirect hostprogs-y += xdp_redirect_map hostprogs-y += xdp_monitor hostprogs-y += syscall_tp +hostprogs-y += test_many_kprobe # Libbpf dependencies LIBBPF := ../../tools/lib/bpf/bpf.o @@ -87,6 +88,7 @@ xdp_redirect-objs := bpf_load.o $(LIBBPF) xdp_redirect_user.o xdp_redirect_map-objs := bpf_load.o $(LIBBPF) xdp_redirect_map_user.o xdp_monitor-objs := bpf_load.o $(LIBBPF) xdp_monitor_user.o syscall_tp-objs := bpf_load.o $(LIBBPF) syscall_tp_user.o +test_many_kprobe-objs := bpf_load.o $(LIBBPF) test_many_kprobe_user.o # Tell kbuild to always build the programs always := $(hostprogs-y) @@ -172,6 +174,7 @@ HOSTLOADLIBES_xdp_redirect += -lelf HOSTLOADLIBES_xdp_redirect_map += -lelf HOSTLOADLIBES_xdp_monitor += -lelf HOSTLOADLIBES_syscall_tp += -lelf +HOSTLOADLIBES_test_many_kprobe += -lelf # Allows pointing LLC/CLANG to a LLVM backend with bpf support, redefine on cmdline: # make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang diff --git 
a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c index dc6d843..ab514c1 100644 --- a/samples/bpf/bpf_load.c +++ b/samples/bpf/bpf_load.c @@ -637,9 +637,8 @@ void read_trace_pipe(void) } } -#define MAX_SYMS 300000 -static struct ksym syms[MAX_SYMS]; -static int sym_cnt; +struct ksym syms[MAX_SYMS]; +int sym_cnt; static int ksym_cmp(const void *p1, const void *p2) { diff --git a/samples/bpf/bpf_load.h b/samples/bpf/bpf_load.h index e7a8a21..16bc263 100644 --- a/samples/bpf/bpf_load.h +++ b/samples/bpf/bpf_load.h @@ -67,6 +67,10 @@ static inline __u64 ptr_to_u64(const void *ptr) return (__u64) (unsigned long) ptr; } +#define MAX_SYMS 300000 +extern struct ksym syms[MAX_SYMS]; +extern int sym_cnt; + int load_kallsyms(void); struct ksym *ksym_search(long key); int set_link_xdp_fd(int ifindex, int fd, __u32 flags); diff --git a/samples/bpf/test_many_kprobe_user.c b/samples/bpf/test_many_kprobe_user.c new file mode 100644 index 0000000..70b680e --- /dev/null +++ b/samples/bpf/test_many_kprobe_user.c @@ -0,0 +1,184 @@ +/* Copyright (c) 2017 Facebook + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of version 2 of the GNU General Public + * License as published by the Free Software Foundation. 
+ */ +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "libbpf.h" +#include "bpf_load.h" +#include "perf-sys.h" + +#define MAX_KPROBES 1000 + +#define DEBUGFS "/sys/kernel/debug/tracing/" + +int kprobes[MAX_KPROBES] = {0}; +int kprobe_count; +int perf_event_fds[MAX_KPROBES]; +const char license[] = "GPL"; + +static __u64 time_get_ns(void) +{ + struct timespec ts; + + clock_gettime(CLOCK_MONOTONIC, &ts); + return ts.tv_sec * 1000000000ull + ts.tv_nsec; +} + +static int kprobe_api(char *func, void *addr, bool use_new_api) +{ + int efd; + struct perf_event_attr attr = {}; + struct probe_desc pd; + char buf[256]; + int err, id; + + attr.sample_type = PERF_SAMPLE_RAW; + attr.sample_period = 1; + attr.wakeup_events = 1; + + if (use_new_api) { + attr.type = PERF_TYPE_PROBE; + if (func) { + pd.func = ptr_to_u64(func); + pd.offset = 0; + } else { + pd.func = 0; + pd.offset = ptr_to_u64(addr); + } + + attr.probe_desc = ptr_to_u64(&pd); + } else { + attr.type = PERF_TYPE_TRACEPOINT; + snprintf(buf, sizeof(buf), + "echo 'p:%s %s' >> /sys/kernel/debug/tracing/kprobe_events", + func, func); + err = system(buf); + if (err < 0) { + printf("failed to create kprobe '%s' error '%s'\n", + func, strerror(errno)); + return -1; + } + + strcpy(buf, DEBUGFS); + strcat(buf, "events/kprobes/"); + strcat(buf, func); + strcat(buf, "/id"); + efd = open(buf, O_RDONLY, 0); + if (efd < 0) { + printf("failed to open event %s\n", func); + return -1; + } + + err = read(efd, buf, sizeof(buf)); + if (err < 0 || err >= sizeof(buf)) { + printf("read from '%s' failed '%s'\n", func, + strerror(errno)); + return -1; + } + + close(efd); + buf[err] = 0; + id = atoi(buf); + attr.config = id; + } + + efd = sys_perf_event_open(&attr, -1/*pid*/, 0/*cpu*/, + -1/*group_fd*/, 0); + + return efd; +} + +static int select_kprobes(void) +{ + int fd; + int i; + + load_kallsyms(); + + kprobe_count = 0; + for (i = 0; i < 
sym_cnt; i++) { + if (strstr(syms[i].name, ".")) + continue; + fd = kprobe_api(syms[i].name, NULL, true); + if (fd < 0) + continue; + close(fd); + kprobes[kprobe_count] = i; + if (++kprobe_count >= MAX_KPROBES) + break; + } + + return 0; +} + +int main(int argc, char *argv[]) +{ + int i; + __u64 start_time; + + select_kprobes(); + + /* clean all trace_kprobe */ + i = system("echo \"\" > /sys/kernel/debug/tracing/kprobe_events"); + + /* test text based API */ + start_time = time_get_ns(); + for (i = 0; i < kprobe_count; i++) + perf_event_fds[i] = kprobe_api(syms[kprobes[i]].name, + NULL, false); + printf("Creating %d kprobes with text-based API takes %f seconds\n", + kprobe_count, (time_get_ns() - start_time) / 1000000000.0); + + start_time = time_get_ns(); + for (i = 0; i < kprobe_count; i++) + if (perf_event_fds[i] > 0) + close(perf_event_fds[i]); + i = system("echo \"\" > /sys/kernel/debug/tracing/kprobe_events"); + printf("Cleaning %d kprobes with text-based API takes %f seconds\n", + kprobe_count, (time_get_ns() - start_time) / 1000000000.0); + + /* test PERF_TYPE_PROBE API, with function names */ + start_time = time_get_ns(); + for (i = 0; i < kprobe_count; i++) + perf_event_fds[i] = kprobe_api(syms[kprobes[i]].name, + NULL, true); + printf("Creating %d kprobes with PERF_TYPE_PROBE (function name) takes %f seconds\n", + kprobe_count, (time_get_ns() - start_time) / 1000000000.0); + + start_time = time_get_ns(); + for (i = 0; i < kprobe_count; i++) + if (perf_event_fds[i] > 0) + close(perf_event_fds[i]); + printf("Cleaning %d kprobes with PERF_TYPE_PROBE (function name) takes %f seconds\n", + kprobe_count, (time_get_ns() - start_time) / 1000000000.0); + + /* test PERF_TYPE_PROBE API, with function address */ + start_time = time_get_ns(); + for (i = 0; i < kprobe_count; i++) + perf_event_fds[i] = kprobe_api( + NULL, (void *)(syms[kprobes[i]].addr), true); + printf("Creating %d kprobes with PERF_TYPE_PROBE (function addr) takes %f seconds\n", + kprobe_count, 
(time_get_ns() - start_time) / 1000000000.0); + + start_time = time_get_ns(); + for (i = 0; i < kprobe_count; i++) + if (perf_event_fds[i] > 0) + close(perf_event_fds[i]); + printf("Cleaning %d kprobes with PERF_TYPE_PROBE (function addr) takes %f seconds\n", + kprobe_count, (time_get_ns() - start_time) / 1000000000.0); + return 0; +}
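
For readers who want to try the measurement pattern from test_many_kprobe_user.c outside the kernel tree, here is a minimal standalone sketch of the two userspace helpers it relies on: ptr_to_u64(), which stores a pointer in a 64-bit attribute field, and a CLOCK_MONOTONIC stopwatch. It does not touch PERF_TYPE_PROBE itself. As assumptions of this sketch, uint64_t and uintptr_t stand in for the patch's __u64 and unsigned long, and elapsed_sec() is a hypothetical helper that is not part of the series.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Mirrors ptr_to_u64() from bpf_load.h. The pointer is first converted to
 * an unsigned integer of pointer width, so widening to 64 bits is a plain
 * unsigned conversion (uintptr_t is this sketch's substitute for the
 * patch's unsigned long). */
static inline uint64_t ptr_to_u64(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}

/* Mirrors time_get_ns() from the test: monotonic time in nanoseconds. */
static uint64_t time_get_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper, not in the patch: seconds elapsed since start_ns. */
static double elapsed_sec(uint64_t start_ns)
{
	uint64_t now = time_get_ns();

	assert(now >= start_ns);	/* CLOCK_MONOTONIC does not go backwards */
	return (double)(now - start_ns) / 1000000000.0;
}
```

A caller would record start = time_get_ns() before a create or cleanup loop and print elapsed_sec(start) after it, which is what main() in the test does inline for each of the three API variants.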