From patchwork Wed Mar 18 13:06:11 2020
X-Patchwork-Submitter: Eelco Chaudron
X-Patchwork-Id: 1257491
From: Eelco Chaudron
To: bpf@vger.kernel.org
Cc: davem@davemloft.net, netdev@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, kafai@fb.com, songliubraving@fb.com, yhs@fb.com, andriin@fb.com
Subject: [RFC PATCH bpf-next 1/3] bpf: introduce trace option to the BPF_PROG_TEST_RUN command API
Date: Wed, 18 Mar 2020 13:06:11 +0000
Message-Id: <158453677145.3043.1585141550260881967.stgit@xdp-tutorial>
In-Reply-To: <158453675319.3043.5779623595270458781.stgit@xdp-tutorial>
References: <158453675319.3043.5779623595270458781.stgit@xdp-tutorial>

This patch adds a flag to the existing BPF_PROG_TEST_RUN API, which will allow tracing for XDP programs.
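For orientation only (not part of this patch): a minimal user-space sketch of how the new flag could be requested on a test run. It assumes an already loaded XDP program file descriptor and an input packet buffer; the helper name and variables are made up for illustration, and BPF_F_TEST_ENABLE_TRACE plus the test.flags field only exist with this series applied. Patch 3/3 contains the actual selftest helper.

#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: run one traced iteration of a loaded XDP program. */
static int xdp_test_run_traced(int prog_fd, void *pkt, __u32 pkt_len)
{
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.test.prog_fd = prog_fd;
	attr.test.data_in = (__u64)(unsigned long)pkt;
	attr.test.data_size_in = pkt_len;
	attr.test.repeat = 1;
	attr.test.flags = BPF_F_TEST_ENABLE_TRACE;	/* new flag added by this patch */

	return syscall(__NR_bpf, BPF_PROG_TEST_RUN, &attr, sizeof(attr));
}

On success the traced run behaves like a normal test run (retval, duration, and data_out are filled in as before); the instruction trace itself is emitted by patch 2/3.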
Signed-off-by: Eelco Chaudron --- include/linux/filter.h | 13 +++++++++++++ include/uapi/linux/bpf.h | 4 ++++ kernel/bpf/syscall.c | 2 +- net/bpf/test_run.c | 17 +++++++++++------ tools/include/uapi/linux/bpf.h | 4 ++++ 5 files changed, 33 insertions(+), 7 deletions(-) diff --git a/include/linux/filter.h b/include/linux/filter.h index 43b5e455d2f5..f95f9ad45ad6 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -737,6 +737,19 @@ static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog, BPF_DISPATCHER_FUNC(bpf_dispatcher_xdp)); } +static __always_inline u32 bpf_prog_run_xdp_trace(const struct bpf_prog *prog, + struct xdp_buff *xdp) +{ + /* Caller needs to hold rcu_read_lock() (!), otherwise program + * can be released while still running, or map elements could be + * freed early while still having concurrent users. XDP fastpath + * already takes rcu_read_lock() when fetching the program, so + * it's not necessary here anymore. + */ + return __BPF_PROG_RUN(prog, xdp, + BPF_DISPATCHER_FUNC(bpf_dispatcher_xdp)); +} + void bpf_prog_change_xdp(struct bpf_prog *prev_prog, struct bpf_prog *prog); static inline u32 bpf_prog_insn_size(const struct bpf_prog *prog) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 40b2d9476268..ac5c89903550 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -375,6 +375,9 @@ enum { */ #define BPF_F_QUERY_EFFECTIVE (1U << 0) +/* Flags for BPF_PROG_TEST_RUN. */ +#define BPF_F_TEST_ENABLE_TRACE (1U << 0) + enum bpf_stack_build_id_status { /* user space need an empty entry to identify end of a trace */ BPF_STACK_BUILD_ID_EMPTY = 0, @@ -511,6 +514,7 @@ union bpf_attr { */ __aligned_u64 ctx_in; __aligned_u64 ctx_out; + __u32 flags; } test; struct { /* anonymous struct used by BPF_*_GET_*_ID */ diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 7ce0815793dd..9a6fae428976 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2680,7 +2680,7 @@ static int bpf_prog_query(const union bpf_attr *attr, return cgroup_bpf_prog_query(attr, uattr); } -#define BPF_PROG_TEST_RUN_LAST_FIELD test.ctx_out +#define BPF_PROG_TEST_RUN_LAST_FIELD test.flags static int bpf_prog_test_run(const union bpf_attr *attr, union bpf_attr __user *uattr) diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index 4c921f5154e0..061cad840b05 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -16,7 +16,7 @@ #include static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat, - u32 *retval, u32 *time, bool xdp) + u32 *retval, u32 *time, bool xdp, bool trace) { struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE] = { NULL }; enum bpf_cgroup_storage_type stype; @@ -43,10 +43,14 @@ static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat, for (i = 0; i < repeat; i++) { bpf_cgroup_storage_set(storage); - if (xdp) - *retval = bpf_prog_run_xdp(prog, ctx); - else + if (xdp) { + if (trace) + *retval = bpf_prog_run_xdp_trace(prog, ctx); + else + *retval = bpf_prog_run_xdp(prog, ctx); + } else { *retval = BPF_PROG_RUN(prog, ctx); + } if (signal_pending(current)) { ret = -EINTR; @@ -431,7 +435,7 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, ret = convert___skb_to_skb(skb, ctx); if (ret) goto out; - ret = bpf_test_run(prog, skb, repeat, &retval, &duration, false); + ret = bpf_test_run(prog, skb, repeat, &retval, &duration, false, false); if (ret) goto out; if (!is_l2) { @@ -468,6 +472,7 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union 
bpf_attr *kattr, { u32 size = kattr->test.data_size_in; u32 repeat = kattr->test.repeat; + bool trace = kattr->test.flags & BPF_F_TEST_ENABLE_TRACE; struct netdev_rx_queue *rxqueue; struct xdp_buff xdp = {}; u32 retval, duration; @@ -489,7 +494,7 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr, rxqueue = __netif_get_rx_queue(current->nsproxy->net_ns->loopback_dev, 0); xdp.rxq = &rxqueue->xdp_rxq; bpf_prog_change_xdp(NULL, prog); - ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true); + ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true, trace); if (ret) goto out; if (xdp.data != data + XDP_PACKET_HEADROOM + NET_IP_ALIGN || diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 40b2d9476268..ac5c89903550 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -375,6 +375,9 @@ enum { */ #define BPF_F_QUERY_EFFECTIVE (1U << 0) +/* Flags for BPF_PROG_TEST_RUN. */ +#define BPF_F_TEST_ENABLE_TRACE (1U << 0) + enum bpf_stack_build_id_status { /* user space need an empty entry to identify end of a trace */ BPF_STACK_BUILD_ID_EMPTY = 0, @@ -511,6 +514,7 @@ union bpf_attr { */ __aligned_u64 ctx_in; __aligned_u64 ctx_out; + __u32 flags; } test; struct { /* anonymous struct used by BPF_*_GET_*_ID */

From patchwork Wed Mar 18 13:06:19 2020
X-Patchwork-Submitter: Eelco Chaudron
X-Patchwork-Id: 1257488
From: Eelco Chaudron
To: bpf@vger.kernel.org
Cc: davem@davemloft.net, netdev@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, kafai@fb.com, songliubraving@fb.com, yhs@fb.com, andriin@fb.com
Subject: [RFC PATCH bpf-next 2/3] bpf: add tracing for XDP programs using the BPF_PROG_TEST_RUN API
Date: Wed, 18 Mar 2020 13:06:19 +0000
Message-Id: <158453677929.3043.16929314754168577676.stgit@xdp-tutorial>
In-Reply-To: <158453675319.3043.5779623595270458781.stgit@xdp-tutorial>
References: <158453675319.3043.5779623595270458781.stgit@xdp-tutorial>

Add support for tracing eBPF XDP programs that are executed via the BPF_PROG_TEST_RUN command of the bpf() syscall. This is done by switching from the JITed program (if enabled) to the interpreter, recording each executed instruction. For now, the execution history is printed to the kernel ring buffer using pr_info(); the final version should store it in a user-supplied buffer.

Signed-off-by: Eelco Chaudron --- include/linux/filter.h | 36 ++- kernel/bpf/core.c | 679 ++++++++++++++++++++++++++++++++++++++++++++++++ kernel/bpf/verifier.c | 10 - 3 files changed, 707 insertions(+), 18 deletions(-) diff --git a/include/linux/filter.h b/include/linux/filter.h index f95f9ad45ad6..3d8fdbde2d02 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -523,6 +523,9 @@ struct bpf_binary_header { u8 image[] __aligned(BPF_IMAGE_ALIGNMENT); }; +typedef unsigned int (*bpf_func_t)(const void *ctx, + const struct bpf_insn *insn); + struct bpf_prog { u16 pages; /* Number of allocated pages */ u16 jited:1, /* Is our filter JIT'ed?
*/ @@ -542,8 +545,7 @@ struct bpf_prog { u8 tag[BPF_TAG_SIZE]; struct bpf_prog_aux *aux; /* Auxiliary fields */ struct sock_fprog_kern *orig_prog; /* Original BPF program */ - unsigned int (*bpf_func)(const void *ctx, - const struct bpf_insn *insn); + bpf_func_t bpf_func; /* Instructions for interpreter */ union { struct sock_filter insns[0]; @@ -559,6 +561,29 @@ struct sk_filter { DECLARE_STATIC_KEY_FALSE(bpf_stats_enabled_key); +#define __BPF_PROG_RUN_INTERPRETER_TRACE(prog, ctx, dfunc) ({ \ + u32 ret; \ + bpf_func_t bpf_func = bpf_get_trace_interpreter(prog); \ + if (!bpf_func) { \ + /* This should not happen, but if it does warn once */ \ + WARN_ON_ONCE(1); \ + return 0; \ + } \ + cant_migrate(); \ + if (static_branch_unlikely(&bpf_stats_enabled_key)) { \ + struct bpf_prog_stats *stats; \ + u64 start = sched_clock(); \ + ret = dfunc(ctx, (prog)->insnsi, bpf_func); \ + stats = this_cpu_ptr(prog->aux->stats); \ + u64_stats_update_begin(&stats->syncp); \ + stats->cnt++; \ + stats->nsecs += sched_clock() - start; \ + u64_stats_update_end(&stats->syncp); \ + } else { \ + ret = dfunc(ctx, (prog)->insnsi, bpf_func); \ + } \ + ret; }) + #define __BPF_PROG_RUN(prog, ctx, dfunc) ({ \ u32 ret; \ cant_migrate(); \ @@ -722,6 +747,8 @@ static inline u32 bpf_prog_run_clear_cb(const struct bpf_prog *prog, return res; } +bpf_func_t bpf_get_trace_interpreter(const struct bpf_prog *fp); + DECLARE_BPF_DISPATCHER(bpf_dispatcher_xdp) static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog, @@ -746,8 +773,9 @@ static __always_inline u32 bpf_prog_run_xdp_trace(const struct bpf_prog *prog, * already takes rcu_read_lock() when fetching the program, so * it's not necessary here anymore. */ - return __BPF_PROG_RUN(prog, xdp, - BPF_DISPATCHER_FUNC(bpf_dispatcher_xdp)); + return __BPF_PROG_RUN_INTERPRETER_TRACE(prog, xdp, + BPF_DISPATCHER_FUNC( + bpf_dispatcher_xdp)); } void bpf_prog_change_xdp(struct bpf_prog *prev_prog, struct bpf_prog *prog); diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 973a20d49749..5bfc7b5c5af5 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -34,6 +34,8 @@ #include #include +#include "disasm.h" + /* Registers */ #define BPF_R0 regs[BPF_REG_0] #define BPF_R1 regs[BPF_REG_1] @@ -1341,13 +1343,17 @@ bool bpf_opcode_in_insntable(u8 code) return public_insntable[code]; } -#ifndef CONFIG_BPF_JIT_ALWAYS_ON u64 __weak bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr) { memset(dst, 0, size); return -EFAULT; } +/* TODO: Removed "ifndef CONFIG_BPF_JIT_ALWAYS_ON" for now from __bpf_prog_run + * as we need it for fixup_call_args(), i.e. bpf 2 bpf call with the + * interpreter. 
+ */ + /** * __bpf_prog_run - run eBPF program on a given context * @regs: is the array of MAX_BPF_EXT_REG eBPF pseudo-registers @@ -1634,6 +1640,625 @@ static u64 __no_fgcse ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u6 return 0; } +struct bpf_trace_ctx { + u32 insn_executed; + u32 stack_size; + u64 *stack; + u64 *regs; + u64 regs_stored[MAX_BPF_EXT_REG]; + void *addr_stored; + u64 addr_value_stored; + char scratch[80]; + char scratch_addr[KSYM_NAME_LEN]; + const struct bpf_insn *prog_insn; + const struct bpf_insn *prev_insn; +}; + +static __always_inline void __bpf_trace_dump_stack(struct bpf_trace_ctx *ctx) +{ + int i; + + pr_info("BPFTrace: STACK(%u bytes):", ctx->stack_size); + + for (i = 0; i < (ctx->stack_size / sizeof(u64) / 4); i++) { + pr_info("BPFTrace: %px: 0x%016llx %016llx %016llx %016llx", + &ctx->stack[i * 4], + ctx->stack[i * 4 + 0], ctx->stack[i * 4 + 1], + ctx->stack[i * 4 + 2], ctx->stack[i * 4 + 3]); + } + + if ((ctx->stack_size / sizeof(u64) % 4) > 0) { + int j; + char line[96]; + int bytes_used = 0; + + bytes_used += snprintf(line, sizeof(line), + "BPFTrace: %px: 0x", + &ctx->stack[i * 4]); + + for (j = 0; j < (ctx->stack_size / sizeof(u64) % 4); j++) { + bytes_used += snprintf(line + bytes_used, + sizeof(line) - bytes_used, + "%016llx ", + ctx->stack[i * 4 + j]); + } + pr_info("%s", line); + } +} + +static __always_inline void __bpf_trace_dump_regs(struct bpf_trace_ctx *ctx) +{ + pr_info("BPFTrace: REGISTERS:"); + pr_info("BPFTrace: r00-03: 0x%016llx %016llx %016llx %016llx", + ctx->regs[BPF_REG_0], ctx->regs[BPF_REG_1], + ctx->regs[BPF_REG_2], ctx->regs[BPF_REG_3]); + pr_info("BPFTrace: r04-07: 0x%016llx %016llx %016llx %016llx", + ctx->regs[BPF_REG_4], ctx->regs[BPF_REG_5], + ctx->regs[BPF_REG_6], ctx->regs[BPF_REG_7]); + pr_info("BPFTrace: r08-11: 0x%016llx %016llx %016llx %016llx", + ctx->regs[BPF_REG_8], ctx->regs[BPF_REG_9], + ctx->regs[BPF_REG_10], ctx->regs[BPF_REG_AX]); +} + +static __always_inline void __bpf_trace_init(struct bpf_trace_ctx *ctx, + const struct bpf_insn *insn, + u64 *regs, + u64 *stack, u32 stack_size) +{ + memset(ctx, 0, sizeof(*ctx)); + ctx->prog_insn = insn; + ctx->regs = regs; + ctx->stack = stack; + ctx->stack_size = stack_size; + + __bpf_trace_dump_stack(ctx); + pr_info("BPFTrace:"); + __bpf_trace_dump_regs(ctx); + pr_info("BPFTrace:"); +} + +static const char *__bpf_cb_call(void *private_data, + const struct bpf_insn *insn) +{ + /* For now only return helper names, we can fix this in the + * actual tool decoding the trace buffer, i.e. bpftool + */ + struct bpf_trace_ctx *ctx = private_data; + + if (insn->src_reg == BPF_PSEUDO_CALL) { + strncpy(ctx->scratch_addr, "unknown", + sizeof(ctx->scratch_addr)); + + } else { + u64 address; + + if (__bpf_call_base + insn->imm) + address = (u64) __bpf_call_base + insn->imm; + else + address = (u64) __bpf_call_base_args + insn->imm; + + + if (!kallsyms_lookup(address, NULL, NULL, NULL, + ctx->scratch_addr)) { + + snprintf(ctx->scratch_addr, sizeof(ctx->scratch_addr), + "%#016llx", address); + } + } + return ctx->scratch_addr; +} + + +static void __bpf_cb_print(void *private_data, const char *fmt, ...) 
+{ + struct bpf_trace_ctx *ctx = private_data; + va_list args; + + va_start(args, fmt); + vsnprintf(ctx->scratch, sizeof(ctx->scratch), fmt, args); + va_end(args); +} + +static __always_inline void __bpf_trace_next(struct bpf_trace_ctx *ctx, + const struct bpf_insn *insn) +{ + int reg; + + /* For tracing handle results of previous function */ + if (ctx->prev_insn != NULL) { + u64 pc; + const struct bpf_insn *prev_insn = ctx->prev_insn; + const struct bpf_insn_cbs cbs = { + .cb_call = __bpf_cb_call, + .cb_print = __bpf_cb_print, + .private_data = ctx, + }; + ctx->scratch[0] = 0; + /* + * TODO: Here we need to handle the BPF_CALL_ARGS and + * BPF_TAIL_CALL for the disassemble part, see also the + * bpf_insn_prepare_dump() call. + */ + print_bpf_insn(&cbs, prev_insn, true); + ctx->scratch[strlen(ctx->scratch) - 1] = 0; + + /* TODO: We need a better way to calculate the pc, as the pc + * could be out of scope, i.e. external call. We could + * use bpf_prog_insn_size() if we have the bpf_prog + * structure. + */ + pc = (u64) (prev_insn - ctx->prog_insn); + + pr_info("BPFTrace: %u[%#llx]: %s", + ctx->insn_executed, pc, + ctx->scratch); + + switch (BPF_CLASS(prev_insn->code)) { + case BPF_ALU: + case BPF_ALU64: + case BPF_LD: + case BPF_LDX: + pr_info("BPFTrace: r%u: %#018llx -> %#018llx", + ctx->prev_insn->dst_reg, + ctx->regs_stored[prev_insn->dst_reg], + ctx->regs[prev_insn->dst_reg]); + break; + + case BPF_ST: + case BPF_STX: + switch (BPF_SIZE(insn->code)) { + case BPF_B: + pr_info("BPFTrace: MEM[%px]: %#04llx -> %#04x", + ctx->addr_stored, + ctx->addr_value_stored, + *(u8 *) ctx->addr_stored); + break; + case BPF_H: + pr_info("BPFTrace: MEM[%px]: %#06llx -> %#06x", + ctx->addr_stored, + ctx->addr_value_stored, + *(u16 *) ctx->addr_stored); + break; + case BPF_W: + pr_info("BPFTrace: MEM[%px]: %#010llx -> %#010x", + ctx->addr_stored, + ctx->addr_value_stored, + *(u32 *) ctx->addr_stored); + break; + case BPF_DW: + pr_info("BPFTrace: MEM[%px]: %#018llx -> %#018llx", + ctx->addr_stored, + ctx->addr_value_stored, + *(u64 *) ctx->addr_stored); + break; + } + if (BPF_OP(prev_insn->code) == BPF_XADD) + pr_info("BPFTrace: !! Above values are not retrieved atomically !!"); + + break; + + case BPF_JMP: + case BPF_JMP32: + switch (BPF_OP(prev_insn->code)) { + + default: + if (BPF_SRC(prev_insn->code) == BPF_X) + pr_warn("BPFTrace: r%u: %#018llx <==> r%u: %#018llx", + prev_insn->dst_reg, + ctx->regs[prev_insn->dst_reg], + prev_insn->src_reg, + ctx->regs[prev_insn->src_reg]); + else + pr_warn("BPFTrace: r%u: %#018llx <==> %#08x", + prev_insn->dst_reg, + ctx->regs[prev_insn->dst_reg], + prev_insn->imm); + fallthrough; + + case BPF_JA: + if ((prev_insn + 1) == insn) { + pr_warn("BPFTrace: branch was NOT taken"); + } else { + /* TODO: See note above on pc. 
*/ + pr_warn("BPFTrace: pc: %#018llx -> %#018llx", + (u64) (prev_insn - ctx->prog_insn), + (u64) (insn - ctx->prog_insn)); + pr_warn("BPFTrace: branch was taken"); + } + break; + + case BPF_EXIT: + break; + + case BPF_CALL: + case BPF_CALL_ARGS: + case BPF_TAIL_CALL: + for (reg = BPF_REG_0; reg <= BPF_REG_5; + reg++) { + if (ctx->regs_stored[reg] != ctx->regs[reg]) + pr_info("BPFTrace: r%u: %#018llx -> %#018llx", + reg, + ctx->regs_stored[reg], + ctx->regs[reg]); + } + break; + } + break; + + default: + pr_warn("BPFTrace: No decode for results %d", + BPF_CLASS(prev_insn->code)); + break; + } + } + + if (insn == NULL) + return; + + /* For tracing store information needed before it gets changed */ + switch (BPF_CLASS(insn->code)) { + case BPF_ALU: + case BPF_ALU64: + case BPF_LD: + case BPF_LDX: + ctx->regs_stored[insn->dst_reg] = ctx->regs[insn->dst_reg]; + break; + + case BPF_ST: + case BPF_STX: + ctx->addr_stored = (void *) ctx->regs[insn->dst_reg] + insn->off; + switch (BPF_SIZE(insn->code)) { + case BPF_B: + ctx->addr_value_stored = *(u8 *) ctx->addr_stored; + break; + case BPF_H: + ctx->addr_value_stored = *(u16 *) ctx->addr_stored; + break; + case BPF_W: + ctx->addr_value_stored = *(u32 *) ctx->addr_stored; + break; + case BPF_DW: + ctx->addr_value_stored = *(u64 *) ctx->addr_stored; + break; + default: + ctx->addr_value_stored = 0; + break; + } + break; + + case BPF_JMP32: + /* Nothing we need to store for this case */ + break; + + case BPF_JMP: + /* Only for special jump we need to store stuff */ + if (BPF_OP(insn->code) == BPF_CALL || + BPF_OP(insn->code) == BPF_CALL_ARGS) { + + for (reg = BPF_REG_0; reg <= BPF_REG_5; reg++) + ctx->regs_stored[reg] = ctx->regs[reg]; + + } else if (BPF_OP(insn->code) == BPF_TAIL_CALL) { + + pr_warn("BPFTrace: TODO: No decode for results %d", + BPF_OP(insn->code)); + } + break; + + default: + pr_warn("BPFTrace: No capture for ORIGINAL value for %d", + BPF_CLASS(insn->code)); + break; + } + ctx->insn_executed++; + ctx->prev_insn = insn; +} + +static __always_inline void __bpf_trace_complete(struct bpf_trace_ctx *ctx, + u64 ret) +{ + __bpf_trace_next(ctx, NULL); + pr_info("BPFTrace:"); + __bpf_trace_dump_stack(ctx); + pr_info("BPFTrace:"); + __bpf_trace_dump_regs(ctx); + pr_info("BPFTrace:"); + pr_info("BPFTrace: RESULT:"); + pr_info("BPFTrace: return = %#018llx", ret); +} + +/** + * __bpf_prog_run_trace - run eBPF program on a given context + * @regs: is the array of MAX_BPF_EXT_REG eBPF pseudo-registers + * @insn: is the array of eBPF instructions + * @stack: is the eBPF storage stack + * + * Decode and execute eBPF instructions with tracing enabled. + * + * TODO: This is just a copy of the none trace __bpf_prog_run(), we need to + * integrate the two with out adding duplicate code... + */ +static u64 __no_fgcse ___bpf_prog_run_trace(struct bpf_trace_ctx *trace_ctx, + u64 *regs, + const struct bpf_insn *insn, + u64 *stack) +{ +#define BPF_INSN_2_LBL(x, y) [BPF_##x | BPF_##y] = &&x##_##y +#define BPF_INSN_3_LBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = &&x##_##y##_##z + static const void * const jumptable[256] __annotate_jump_table = { + [0 ... 255] = &&default_label, + /* Now overwrite non-defaults ... */ + BPF_INSN_MAP(BPF_INSN_2_LBL, BPF_INSN_3_LBL), + /* Non-UAPI available opcodes. 
*/ + [BPF_JMP | BPF_CALL_ARGS] = &&JMP_CALL_ARGS, + [BPF_JMP | BPF_TAIL_CALL] = &&JMP_TAIL_CALL, + [BPF_LDX | BPF_PROBE_MEM | BPF_B] = &&LDX_PROBE_MEM_B, + [BPF_LDX | BPF_PROBE_MEM | BPF_H] = &&LDX_PROBE_MEM_H, + [BPF_LDX | BPF_PROBE_MEM | BPF_W] = &&LDX_PROBE_MEM_W, + [BPF_LDX | BPF_PROBE_MEM | BPF_DW] = &&LDX_PROBE_MEM_DW, + }; +#undef BPF_INSN_3_LBL +#undef BPF_INSN_2_LBL + u32 tail_call_cnt = 0; + +#define CONT ({ insn++; goto select_insn; }) +#define CONT_JMP ({ insn++; goto select_insn; }) +select_insn: + __bpf_trace_next(trace_ctx, insn); + goto *jumptable[insn->code]; + + /* ALU */ +#define ALU(OPCODE, OP) \ + ALU64_##OPCODE##_X: \ + DST = DST OP SRC; \ + CONT; \ + ALU_##OPCODE##_X: \ + DST = (u32) DST OP (u32) SRC; \ + CONT; \ + ALU64_##OPCODE##_K: \ + DST = DST OP IMM; \ + CONT; \ + ALU_##OPCODE##_K: \ + DST = (u32) DST OP (u32) IMM; \ + CONT; + + ALU(ADD, +) + ALU(SUB, -) + ALU(AND, &) + ALU(OR, |) + ALU(LSH, <<) + ALU(RSH, >>) + ALU(XOR, ^) + ALU(MUL, *) +#undef ALU + ALU_NEG: + DST = (u32) -DST; + CONT; + ALU64_NEG: + DST = -DST; + CONT; + ALU_MOV_X: + DST = (u32) SRC; + CONT; + ALU_MOV_K: + DST = (u32) IMM; + CONT; + ALU64_MOV_X: + DST = SRC; + CONT; + ALU64_MOV_K: + DST = IMM; + CONT; + LD_IMM_DW: + DST = (u64) (u32) insn[0].imm | ((u64) (u32) insn[1].imm) << 32; + insn++; + CONT; + ALU_ARSH_X: + DST = (u64) (u32) (((s32) DST) >> SRC); + CONT; + ALU_ARSH_K: + DST = (u64) (u32) (((s32) DST) >> IMM); + CONT; + ALU64_ARSH_X: + (*(s64 *) &DST) >>= SRC; + CONT; + ALU64_ARSH_K: + (*(s64 *) &DST) >>= IMM; + CONT; + ALU64_MOD_X: + div64_u64_rem(DST, SRC, &AX); + DST = AX; + CONT; + ALU_MOD_X: + AX = (u32) DST; + DST = do_div(AX, (u32) SRC); + CONT; + ALU64_MOD_K: + div64_u64_rem(DST, IMM, &AX); + DST = AX; + CONT; + ALU_MOD_K: + AX = (u32) DST; + DST = do_div(AX, (u32) IMM); + CONT; + ALU64_DIV_X: + DST = div64_u64(DST, SRC); + CONT; + ALU_DIV_X: + AX = (u32) DST; + do_div(AX, (u32) SRC); + DST = (u32) AX; + CONT; + ALU64_DIV_K: + DST = div64_u64(DST, IMM); + CONT; + ALU_DIV_K: + AX = (u32) DST; + do_div(AX, (u32) IMM); + DST = (u32) AX; + CONT; + ALU_END_TO_BE: + switch (IMM) { + case 16: + DST = (__force u16) cpu_to_be16(DST); + break; + case 32: + DST = (__force u32) cpu_to_be32(DST); + break; + case 64: + DST = (__force u64) cpu_to_be64(DST); + break; + } + CONT; + ALU_END_TO_LE: + switch (IMM) { + case 16: + DST = (__force u16) cpu_to_le16(DST); + break; + case 32: + DST = (__force u32) cpu_to_le32(DST); + break; + case 64: + DST = (__force u64) cpu_to_le64(DST); + break; + } + CONT; + + /* CALL */ + JMP_CALL: + /* Function call scratches BPF_R1-BPF_R5 registers, + * preserves BPF_R6-BPF_R9, and stores return value + * into BPF_R0. 
+ */ + BPF_R0 = (__bpf_call_base + insn->imm)(BPF_R1, BPF_R2, BPF_R3, + BPF_R4, BPF_R5); + CONT; + + JMP_CALL_ARGS: + BPF_R0 = (__bpf_call_base_args + insn->imm)(BPF_R1, BPF_R2, + BPF_R3, BPF_R4, + BPF_R5, + insn + insn->off + 1); + CONT; + + JMP_TAIL_CALL: { + struct bpf_map *map = (struct bpf_map *) (unsigned long) BPF_R2; + struct bpf_array *array = container_of(map, struct bpf_array, map); + struct bpf_prog *prog; + u32 index = BPF_R3; + + if (unlikely(index >= array->map.max_entries)) + goto out; + if (unlikely(tail_call_cnt > MAX_TAIL_CALL_CNT)) + goto out; + + tail_call_cnt++; + + prog = READ_ONCE(array->ptrs[index]); + if (!prog) + goto out; + + /* ARG1 at this point is guaranteed to point to CTX from + * the verifier side due to the fact that the tail call is + * handeled like a helper, that is, bpf_tail_call_proto, + * where arg1_type is ARG_PTR_TO_CTX. + */ + insn = prog->insnsi; + goto select_insn; +out: + CONT; + } + JMP_JA: + insn += insn->off; + CONT; + JMP_EXIT: + return BPF_R0; + /* JMP */ +#define COND_JMP(SIGN, OPCODE, CMP_OP) \ + JMP_##OPCODE##_X: \ + if ((SIGN##64) DST CMP_OP (SIGN##64) SRC) { \ + insn += insn->off; \ + CONT_JMP; \ + } \ + CONT; \ + JMP32_##OPCODE##_X: \ + if ((SIGN##32) DST CMP_OP (SIGN##32) SRC) { \ + insn += insn->off; \ + CONT_JMP; \ + } \ + CONT; \ + JMP_##OPCODE##_K: \ + if ((SIGN##64) DST CMP_OP (SIGN##64) IMM) { \ + insn += insn->off; \ + CONT_JMP; \ + } \ + CONT; \ + JMP32_##OPCODE##_K: \ + if ((SIGN##32) DST CMP_OP (SIGN##32) IMM) { \ + insn += insn->off; \ + CONT_JMP; \ + } \ + CONT; + COND_JMP(u, JEQ, ==) + COND_JMP(u, JNE, !=) + COND_JMP(u, JGT, >) + COND_JMP(u, JLT, <) + COND_JMP(u, JGE, >=) + COND_JMP(u, JLE, <=) + COND_JMP(u, JSET, &) + COND_JMP(s, JSGT, >) + COND_JMP(s, JSLT, <) + COND_JMP(s, JSGE, >=) + COND_JMP(s, JSLE, <=) +#undef COND_JMP + /* STX and ST and LDX*/ +#define LDST(SIZEOP, SIZE) \ + STX_MEM_##SIZEOP: \ + *(SIZE *)(unsigned long) (DST + insn->off) = SRC; \ + CONT; \ + ST_MEM_##SIZEOP: \ + *(SIZE *)(unsigned long) (DST + insn->off) = IMM; \ + CONT; \ + LDX_MEM_##SIZEOP: \ + DST = *(SIZE *)(unsigned long) (SRC + insn->off); \ + CONT; + + LDST(B, u8) + LDST(H, u16) + LDST(W, u32) + LDST(DW, u64) +#undef LDST +#define LDX_PROBE(SIZEOP, SIZE) \ + LDX_PROBE_MEM_##SIZEOP: \ + bpf_probe_read_kernel(&DST, SIZE, (const void *)(long) (SRC + insn->off)); \ + CONT; + LDX_PROBE(B, 1) + LDX_PROBE(H, 2) + LDX_PROBE(W, 4) + LDX_PROBE(DW, 8) +#undef LDX_PROBE + + STX_XADD_W: /* lock xadd *(u32 *)(dst_reg + off16) += src_reg */ + atomic_add((u32) SRC, (atomic_t *)(unsigned long) + (DST + insn->off)); + CONT; + STX_XADD_DW: /* lock xadd *(u64 *)(dst_reg + off16) += src_reg */ + atomic64_add((u64) SRC, (atomic64_t *)(unsigned long) + (DST + insn->off)); + CONT; + + default_label: + /* If we ever reach this, we have a bug somewhere. Die hard here + * instead of just returning 0; we could be somewhere in a subprog, + * so execution could continue otherwise which we do /not/ want. + * + * Note, verifier whitelists all opcodes in bpf_opcode_in_insntable(). 
+ */ + pr_warn("BPF interpreter: unknown opcode %02x\n", insn->code); + BUG_ON(1); + return 0; +} + #define PROG_NAME(stack_size) __bpf_prog_run##stack_size #define DEFINE_BPF_PROG_RUN(stack_size) \ static unsigned int PROG_NAME(stack_size)(const void *ctx, const struct bpf_insn *insn) \ @@ -1646,6 +2271,27 @@ static unsigned int PROG_NAME(stack_size)(const void *ctx, const struct bpf_insn return ___bpf_prog_run(regs, insn, stack); \ } +#define PROG_NAME_TRACE(stack_size) __bpf_prog_run_trace##stack_size +#define DEFINE_BPF_PROG_RUN_TRACE(stack_size) \ +static unsigned int PROG_NAME_TRACE(stack_size)(const void *ctx, const struct bpf_insn *insn) \ +{ \ + unsigned int ret; \ + u64 stack[stack_size / sizeof(u64)]; \ + u64 regs[MAX_BPF_EXT_REG]; \ + struct bpf_trace_ctx trace_ctx; \ +\ + memset(stack, 0, sizeof(stack)); \ + memset(regs, 0, sizeof(regs)); \ +\ + FP = (u64) (unsigned long) &stack[ARRAY_SIZE(stack)]; \ + ARG1 = (u64) (unsigned long) ctx; \ +\ + __bpf_trace_init(&trace_ctx, insn, regs, stack, stack_size); \ + ret = ___bpf_prog_run_trace(&trace_ctx, regs, insn, stack); \ + __bpf_trace_complete(&trace_ctx, ret); \ + return ret; \ +} + #define PROG_NAME_ARGS(stack_size) __bpf_prog_run_args##stack_size #define DEFINE_BPF_PROG_RUN_ARGS(stack_size) \ static u64 PROG_NAME_ARGS(stack_size)(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5, \ @@ -1670,16 +2316,20 @@ static u64 PROG_NAME_ARGS(stack_size)(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5, \ #define EVAL5(FN, X, Y...) FN(X) EVAL4(FN, Y) #define EVAL6(FN, X, Y...) FN(X) EVAL5(FN, Y) -EVAL6(DEFINE_BPF_PROG_RUN, 32, 64, 96, 128, 160, 192); -EVAL6(DEFINE_BPF_PROG_RUN, 224, 256, 288, 320, 352, 384); -EVAL4(DEFINE_BPF_PROG_RUN, 416, 448, 480, 512); +EVAL6(DEFINE_BPF_PROG_RUN_TRACE, 32, 64, 96, 128, 160, 192); +EVAL6(DEFINE_BPF_PROG_RUN_TRACE, 224, 256, 288, 320, 352, 384); +EVAL4(DEFINE_BPF_PROG_RUN_TRACE, 416, 448, 480, 512); EVAL6(DEFINE_BPF_PROG_RUN_ARGS, 32, 64, 96, 128, 160, 192); EVAL6(DEFINE_BPF_PROG_RUN_ARGS, 224, 256, 288, 320, 352, 384); EVAL4(DEFINE_BPF_PROG_RUN_ARGS, 416, 448, 480, 512); -#define PROG_NAME_LIST(stack_size) PROG_NAME(stack_size), +#ifndef CONFIG_BPF_JIT_ALWAYS_ON +EVAL6(DEFINE_BPF_PROG_RUN, 32, 64, 96, 128, 160, 192); +EVAL6(DEFINE_BPF_PROG_RUN, 224, 256, 288, 320, 352, 384); +EVAL4(DEFINE_BPF_PROG_RUN, 416, 448, 480, 512); +#define PROG_NAME_LIST(stack_size) PROG_NAME(stack_size), static unsigned int (*interpreters[])(const void *ctx, const struct bpf_insn *insn) = { EVAL6(PROG_NAME_LIST, 32, 64, 96, 128, 160, 192) @@ -1687,6 +2337,17 @@ EVAL6(PROG_NAME_LIST, 224, 256, 288, 320, 352, 384) EVAL4(PROG_NAME_LIST, 416, 448, 480, 512) }; #undef PROG_NAME_LIST +#endif + +#define PROG_NAME_LIST(stack_size) PROG_NAME_TRACE(stack_size), +static unsigned int (*interpreters_trace[])(const void *ctx, + const struct bpf_insn *insn) = { +EVAL6(PROG_NAME_LIST, 32, 64, 96, 128, 160, 192) +EVAL6(PROG_NAME_LIST, 224, 256, 288, 320, 352, 384) +EVAL4(PROG_NAME_LIST, 416, 448, 480, 512) +}; +#undef PROG_NAME_LIST + #define PROG_NAME_LIST(stack_size) PROG_NAME_ARGS(stack_size), static u64 (*interpreters_args[])(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5, const struct bpf_insn *insn) = { @@ -1696,6 +2357,12 @@ EVAL4(PROG_NAME_LIST, 416, 448, 480, 512) }; #undef PROG_NAME_LIST +bpf_func_t bpf_get_trace_interpreter(const struct bpf_prog *fp) +{ + u32 stack_depth = max_t(u32, fp->aux->stack_depth, 1); + return interpreters_trace[(round_up(stack_depth, 32) / 32) - 1]; +} + void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth) { stack_depth = 
max_t(u32, stack_depth, 1); @@ -1705,7 +2372,7 @@ void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth) insn->code = BPF_JMP | BPF_CALL_ARGS; } -#else +#ifdef CONFIG_BPF_JIT_ALWAYS_ON static unsigned int __bpf_prog_ret0_warn(const void *ctx, const struct bpf_insn *insn) { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 55d376c53f7d..d521c5ad8111 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -2720,7 +2720,6 @@ static int check_max_stack_depth(struct bpf_verifier_env *env) goto continue_func; } -#ifndef CONFIG_BPF_JIT_ALWAYS_ON static int get_callee_stack_depth(struct bpf_verifier_env *env, const struct bpf_insn *insn, int idx) { @@ -2734,7 +2733,6 @@ static int get_callee_stack_depth(struct bpf_verifier_env *env, } return env->subprog_info[subprog].stack_depth; } -#endif int check_ctx_reg(struct bpf_verifier_env *env, const struct bpf_reg_state *reg, int regno) @@ -9158,11 +9156,9 @@ static int jit_subprogs(struct bpf_verifier_env *env) static int fixup_call_args(struct bpf_verifier_env *env) { -#ifndef CONFIG_BPF_JIT_ALWAYS_ON struct bpf_prog *prog = env->prog; struct bpf_insn *insn = prog->insnsi; int i, depth; -#endif int err = 0; if (env->prog->jit_requested && @@ -9173,7 +9169,7 @@ static int fixup_call_args(struct bpf_verifier_env *env) if (err == -EFAULT) return err; } -#ifndef CONFIG_BPF_JIT_ALWAYS_ON + for (i = 0; i < prog->len; i++, insn++) { if (insn->code != (BPF_JMP | BPF_CALL) || insn->src_reg != BPF_PSEUDO_CALL) @@ -9183,9 +9179,7 @@ static int fixup_call_args(struct bpf_verifier_env *env) return depth; bpf_patch_call_args(insn, depth); } - err = 0; -#endif - return err; + return 0; } /* fixup insn->imm field of bpf_call instructions

From patchwork Wed Mar 18 13:06:27 2020
X-Patchwork-Submitter: Eelco Chaudron
X-Patchwork-Id: 1257487
From: Eelco Chaudron
To: bpf@vger.kernel.org
Cc: davem@davemloft.net, netdev@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, kafai@fb.com, songliubraving@fb.com, yhs@fb.com, andriin@fb.com
Subject: [RFC PATCH bpf-next 3/3] selftests/bpf: call bpf_prog_test_run with trace enabled for XDP test
Date: Wed, 18 Mar 2020 13:06:27 +0000
Message-Id: <158453678749.3043.9870294925481791434.stgit@xdp-tutorial>
In-Reply-To: <158453675319.3043.5779623595270458781.stgit@xdp-tutorial>
References: <158453675319.3043.5779623595270458781.stgit@xdp-tutorial>

This adds a quick and dirty version of bpf_prog_test_run_trace() right in the xdp.c selftest to demonstrate the new trace functionality. This is a (too long) example output of the trace functionality: STACK(32 bytes): ffffb82b4081faa8: 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000 REGISTERS: r00-03: 0x0000000000000000 ffffb82b4081fde8 0000000000000000 0000000000000000 r04-07: 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000 r08-11: 0x0000000000000000 0000000000000000 ffffb82b4081fac8 0000000000000000 1[0x0]: (bf) r6 = r1 r6: 0x0000000000000000 -> 0xffffb82b4081fde8 2[0x1]: (b4) w8 = 1 r8: 0x0000000000000000 -> 0x0000000000000001 3[0x2]: (b7) r11 = -439116112 r11: 0x0000000000000000 -> 0xffffffffe5d39eb0 4[0x3]: (67) r11 <<= 32 r11: 0xffffffffe5d39eb0 -> 0xe5d39eb000000000 5[0x4]: (4f) r8 |= r11 r8: 0x0000000000000001 -> 0xe5d39eb000000001 6[0x5]: (79) r2 = *(u64 *)(r6 +8) r2: 0x0000000000000000 -> 0xffff8fbaec6d3b36 7[0x6]: (79) r1 = *(u64 *)(r6 +0) r1: 0xffffb82b4081fde8 -> 0xffff8fbaec6d3b00 8[0x7]: (bf) r3 = r1 r3: 0x0000000000000000 -> 0xffff8fbaec6d3b00 9[0x8]: (07) r3 += 14 r3: 0xffff8fbaec6d3b00 -> 0xffff8fbaec6d3b0e 10[0x9]: (2d) if r3 > r2 goto pc+76 r3: 0xffff8fbaec6d3b0e <==> r2: 0xffff8fbaec6d3b36 branch was NOT taken ... ...
300[0x152]: (a4) w2 ^= -1 r2: 0x6ec5cf350000934d -> 0x00000000ffff6cb2 301[0x153]: (b7) r11 = -412759113 r11: 0x6ec5cf3500000000 -> 0xffffffffe765cbb7 302[0x154]: (67) r11 <<= 32 r11: 0xffffffffe765cbb7 -> 0xe765cbb700000000 303[0x155]: (4f) r2 |= r11 r2: 0x00000000ffff6cb2 -> 0xe765cbb7ffff6cb2 304[0x156]: (6b) *(u16 *)(r1 +24) = r2 MEM[ffff8fbaec6d3b04]: 0x00 -> 0xb2 305[0x157]: (71) r1 = *(u8 *)(r10 -12) r1: 0xffff8fbaec6d3aec -> 0x0000000000000006 306[0x158]: (b7) r11 = -1438054225 r11: 0xe765cbb700000000 -> 0xffffffffaa4908af 307[0x159]: (67) r11 <<= 32 r11: 0xffffffffaa4908af -> 0xaa4908af00000000 308[0x15a]: (4f) r1 |= r11 r1: 0x0000000000000006 -> 0xaa4908af00000006 309[0x15b]: (63) *(u32 *)(r10 -4) = r1 MEM[ffffb82b4081fac4]: 0x0000000000000000 -> 0x0000000000000006 310[0x15c]: (bf) r2 = r10 r2: 0xe765cbb7ffff6cb2 -> 0xffffb82b4081fac8 311[0x15d]: (07) r2 += -4 r2: 0xffffb82b4081fac8 -> 0xffffb82b4081fac4 312[0x15e]: (18) r1 = 0xffff8fbaeef8c000 r1: 0xaa4908af00000006 -> 0xffff8fbaeef8c000 313[0x160]: (85) call percpu_array_map_lookup_elem#164336 r0: 0x7af818e200000000 -> 0xffffd82b3fbb4048 314[0x161]: (15) if r0 == 0x0 goto pc+229 r0: 0xffffd82b3fbb4048 <==> 0x000000 branch was NOT taken 315[0x162]: (05) goto pc+225 pc: 0x0000000000000162 -> 0x0000000000000244 branch was taken 316[0x244]: (79) r1 = *(u64 *)(r0 +0) r1: 0xffff8fbaeef8c000 -> 0x0000000000000000 317[0x245]: (07) r1 += 1 r1: 0x0000000000000000 -> 0x0000000000000001 318[0x246]: (7b) *(u64 *)(r0 +0) = r1 MEM[ffffd82b3fbb4048]: 0x00 -> 0x01 319[0x247]: (b4) w8 = 3 r8: 0x14d492f700000045 -> 0x0000000000000003 320[0x248]: (b7) r11 = -1474974291 r11: 0xaa4908af00000000 -> 0xffffffffa815adad 321[0x249]: (67) r11 <<= 32 r11: 0xffffffffa815adad -> 0xa815adad00000000 322[0x24a]: (4f) r8 |= r11 r8: 0x0000000000000003 -> 0xa815adad00000003 323[0x24b]: (05) goto pc-502 pc: 0x000000000000024b -> 0x0000000000000056 branch was taken 324[0x56]: (bc) w0 = w8 r0: 0xffffd82b3fbb4048 -> 0x0000000000000003 325[0x57]: (95) exit STACK(32 bytes): ffffb82b4081faa8: 0x0000000000000000 0000000000000000 0000000600020000 0000000600000000 REGISTERS: r00-03: 0x0000000000000003 0000000000000001 ffffb82b4081fac4 9b5fd9b800000000 r04-07: 0x03f00a9d00000000 2c071e5a00000000 7437bef800000000 ffff8fbaf4b5b6f8 r08-11: 0xa815adad00000003 be1b210e0000934d ffffb82b4081fac8 a815adad00000000 RESULT: return = 0x0000000000000003 Signed-off-by: Eelco Chaudron --- tools/testing/selftests/bpf/prog_tests/xdp.c | 36 +++++++++++++++++++++++++- 1 file changed, 34 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/xdp.c b/tools/testing/selftests/bpf/prog_tests/xdp.c index dcb5ecac778e..8f7a70d39dbf 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp.c @@ -1,6 +1,38 @@ // SPDX-License-Identifier: GPL-2.0 #include + +static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, + unsigned int size) +{ + return syscall(__NR_bpf, cmd, attr, size); +} + +int bpf_prog_test_run_trace(int prog_fd, int repeat, void *data, __u32 size, + void *data_out, __u32 *size_out, __u32 *retval, + __u32 *duration) +{ + union bpf_attr attr; + int ret; + + memset(&attr, 0, sizeof(attr)); + attr.test.prog_fd = prog_fd; + attr.test.data_in = ptr_to_u64(data); + attr.test.data_out = ptr_to_u64(data_out); + attr.test.data_size_in = size; + attr.test.repeat = repeat; + attr.test.flags = BPF_F_TEST_ENABLE_TRACE; + + ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr)); + if (size_out) + *size_out = 
attr.test.data_size_out; + if (retval) + *retval = attr.test.retval; + if (duration) + *duration = attr.test.duration; + return ret; +} + void test_xdp(void) { struct vip key4 = {.protocol = 6, .family = AF_INET}; @@ -25,7 +57,7 @@ void test_xdp(void) bpf_map_update_elem(map_fd, &key4, &value4, 0); bpf_map_update_elem(map_fd, &key6, &value6, 0); - err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4), + err = bpf_prog_test_run_trace(prog_fd, 1, &pkt_v4, sizeof(pkt_v4), buf, &size, &retval, &duration); CHECK(err || retval != XDP_TX || size != 74 || @@ -33,7 +65,7 @@ void test_xdp(void) "err %d errno %d retval %d size %d\n", err, errno, retval, size); - err = bpf_prog_test_run(prog_fd, 1, &pkt_v6, sizeof(pkt_v6), + err = bpf_prog_test_run_trace(prog_fd, 1, &pkt_v6, sizeof(pkt_v6), buf, &size, &retval, &duration); CHECK(err || retval != XDP_TX || size != 114 || iph6->nexthdr != IPPROTO_IPV6, "ipv6",