Message ID | 20191209000114.1876138-2-ast@kernel.org
---|---
State | Accepted
Delegated to: | BPF Maintainers
Series | bpf: Make BPF trampoline friendly to ftrace
On Sun, Dec 8, 2019 at 4:03 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> Depending on the type of BPF programs served by the BPF trampoline, it can
> call the original function. In that case the trampoline will skip one stack
> frame while returning. That will confuse the function_graph tracer and cause
> crashes with a bad RIP. Teach the graph tracer to skip functions that have a
> BPF trampoline attached.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Steven, please take a look.
On Tue, 10 Dec 2019 08:19:42 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Sun, Dec 8, 2019 at 4:03 PM Alexei Starovoitov <ast@kernel.org> wrote:
> >
> > Depending on the type of BPF programs served by the BPF trampoline, it can
> > call the original function. In that case the trampoline will skip one stack
> > frame while returning. That will confuse the function_graph tracer and cause
> > crashes with a bad RIP. Teach the graph tracer to skip functions that have a
> > BPF trampoline attached.
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>
> Steven, please take a look.

I'll try to get to it today or tomorrow. I have some other work to get
done that my job requires I do ;-)

-- Steve
On Sun, 8 Dec 2019 16:01:12 -0800
Alexei Starovoitov <ast@kernel.org> wrote:

>  #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 67e0c462b059..a2659735db73 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -101,6 +101,15 @@ int function_graph_enter(unsigned long ret, unsigned long func,
>  {
>  	struct ftrace_graph_ent trace;
>
> +	/*
> +	 * Skip graph tracing if the return location is served by a direct
> +	 * trampoline, since the call sequence and return addresses are no
> +	 * longer predictable. Ex: the BPF trampoline may call the original
> +	 * function and may skip a frame depending on the type of BPF
> +	 * programs attached.
> +	 */
> +	if (ftrace_direct_func_count &&
> +	    ftrace_find_rec_direct(ret - MCOUNT_INSN_SIZE))

My only worry is that this may not work for all archs that implement
it. But I figure we can cross that bridge when we get to it.

> +		return -EBUSY;
>  	trace.func = func;
>  	trace.depth = ++current->curr_ret_depth;
>

I added this patch to my queue and it's about 70% done going through my
test suite (takes around 10 - 13 hours).

As I'm about to send a pull request to Linus tomorrow, I could include
this patch (as it will be fully tested), and then you could apply the
other two when it hits Linus's tree.

Would that work for you?

-- Steve
On Tue, Dec 10, 2019 at 06:35:19PM -0500, Steven Rostedt wrote:
> On Sun, 8 Dec 2019 16:01:12 -0800
> Alexei Starovoitov <ast@kernel.org> wrote:
>
> >  #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> > index 67e0c462b059..a2659735db73 100644
> > --- a/kernel/trace/fgraph.c
> > +++ b/kernel/trace/fgraph.c
> > @@ -101,6 +101,15 @@ int function_graph_enter(unsigned long ret, unsigned long func,
> >  {
> >  	struct ftrace_graph_ent trace;
> >
> > +	/*
> > +	 * Skip graph tracing if the return location is served by a direct
> > +	 * trampoline, since the call sequence and return addresses are no
> > +	 * longer predictable. Ex: the BPF trampoline may call the original
> > +	 * function and may skip a frame depending on the type of BPF
> > +	 * programs attached.
> > +	 */
> > +	if (ftrace_direct_func_count &&
> > +	    ftrace_find_rec_direct(ret - MCOUNT_INSN_SIZE))
>
> My only worry is that this may not work for all archs that implement
> it. But I figure we can cross that bridge when we get to it.

Right. Since the BPF trampoline is going to be the only user in the short
term it's not an issue; the trampoline is x86-64 only so far.

> > +		return -EBUSY;
> >  	trace.func = func;
> >  	trace.depth = ++current->curr_ret_depth;
> >
>
> I added this patch to my queue and it's about 70% done going through my
> test suite (takes around 10 - 13 hours).
>
> As I'm about to send a pull request to Linus tomorrow, I could include
> this patch (as it will be fully tested), and then you could apply the
> other two when it hits Linus's tree.
>
> Would that work for you?

Awesome. I much appreciate the additional testing. I can certainly wait
another day. I was hoping to get patch 2 all the way to Linus's tree before
rc2 to make sure the register_ftrace_direct() API is used for real in this
kernel cycle. When everything lands I'll backport it to our production
kernel and then the actual stress testing begins :)
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 060a361d9d11..024c3053dbba 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -1042,20 +1042,6 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
 	if (unlikely(atomic_read(&current->tracing_graph_pause)))
 		return;
 
-	/*
-	 * If the return location is actually pointing directly to
-	 * the start of a direct trampoline (if we trace the trampoline
-	 * it will still be offset by MCOUNT_INSN_SIZE), then the
-	 * return address is actually off by one word, and we
-	 * need to adjust for that.
-	 */
-	if (ftrace_direct_func_count) {
-		if (ftrace_find_direct_func(self_addr + MCOUNT_INSN_SIZE)) {
-			self_addr = *parent;
-			parent++;
-		}
-	}
-
 	/*
 	 * Protect against fault, even if it shouldn't
 	 * happen. This tool is too much intrusive to
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 7247d35c3d16..db95244a62d4 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -264,6 +264,7 @@ int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
 				struct dyn_ftrace *rec,
 				unsigned long old_addr,
 				unsigned long new_addr);
+unsigned long ftrace_find_rec_direct(unsigned long ip);
 #else
 # define ftrace_direct_func_count 0
 static inline int register_ftrace_direct(unsigned long ip, unsigned long addr)
@@ -290,6 +291,10 @@ static inline int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
 {
 	return -ENODEV;
 }
+static inline unsigned long ftrace_find_rec_direct(unsigned long ip)
+{
+	return 0;
+}
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 
 #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 67e0c462b059..a2659735db73 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -101,6 +101,15 @@ int function_graph_enter(unsigned long ret, unsigned long func,
 {
 	struct ftrace_graph_ent trace;
 
+	/*
+	 * Skip graph tracing if the return location is served by a direct
+	 * trampoline, since the call sequence and return addresses are no
+	 * longer predictable. Ex: the BPF trampoline may call the original
+	 * function and may skip a frame depending on the type of BPF
+	 * programs attached.
+	 */
+	if (ftrace_direct_func_count &&
+	    ftrace_find_rec_direct(ret - MCOUNT_INSN_SIZE))
+		return -EBUSY;
 	trace.func = func;
 	trace.depth = ++current->curr_ret_depth;
 
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 74439ab5c2b6..ac99a3500076 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2364,7 +2364,7 @@ int ftrace_direct_func_count;
  * Search the direct_functions hash to see if the given instruction pointer
  * has a direct caller attached to it.
  */
-static unsigned long find_rec_direct(unsigned long ip)
+unsigned long ftrace_find_rec_direct(unsigned long ip)
 {
 	struct ftrace_func_entry *entry;
 
@@ -2380,7 +2380,7 @@ static void call_direct_funcs(unsigned long ip, unsigned long pip,
 {
 	unsigned long addr;
 
-	addr = find_rec_direct(ip);
+	addr = ftrace_find_rec_direct(ip);
 	if (!addr)
 		return;
 
@@ -2393,11 +2393,6 @@ struct ftrace_ops direct_ops = {
 			    | FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS
 			    | FTRACE_OPS_FL_PERMANENT,
 };
-#else
-static inline unsigned long find_rec_direct(unsigned long ip)
-{
-	return 0;
-}
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 
 /**
@@ -2417,7 +2412,7 @@ unsigned long ftrace_get_addr_new(struct dyn_ftrace *rec)
 
 	if ((rec->flags & FTRACE_FL_DIRECT) &&
 	    (ftrace_rec_count(rec) == 1)) {
-		addr = find_rec_direct(rec->ip);
+		addr = ftrace_find_rec_direct(rec->ip);
 		if (addr)
 			return addr;
 		WARN_ON_ONCE(1);
@@ -2458,7 +2453,7 @@ unsigned long ftrace_get_addr_curr(struct dyn_ftrace *rec)
 
 	/* Direct calls take precedence over trampolines */
 	if (rec->flags & FTRACE_FL_DIRECT_EN) {
-		addr = find_rec_direct(rec->ip);
+		addr = ftrace_find_rec_direct(rec->ip);
 		if (addr)
 			return addr;
 		WARN_ON_ONCE(1);
@@ -3604,7 +3599,7 @@ static int t_show(struct seq_file *m, void *v)
 		if (rec->flags & FTRACE_FL_DIRECT) {
 			unsigned long direct;
 
-			direct = find_rec_direct(rec->ip);
+			direct = ftrace_find_rec_direct(rec->ip);
 			if (direct)
 				seq_printf(m, "\n\tdirect-->%pS", (void *)direct);
 		}
@@ -5008,7 +5003,7 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
 	mutex_lock(&direct_mutex);
 
 	/* See if there's a direct function at @ip already */
-	if (find_rec_direct(ip))
+	if (ftrace_find_rec_direct(ip))
 		goto out_unlock;
 
 	ret = -ENODEV;
@@ -5027,7 +5022,7 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
 	if (ip != rec->ip) {
 		ip = rec->ip;
 		/* Need to check this ip for a direct. */
-		if (find_rec_direct(ip))
+		if (ftrace_find_rec_direct(ip))
 			goto out_unlock;
 	}
Depending on the type of BPF programs served by the BPF trampoline, it can
call the original function. In that case the trampoline will skip one stack
frame while returning. That will confuse the function_graph tracer and cause
crashes with a bad RIP. Teach the graph tracer to skip functions that have a
BPF trampoline attached.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 arch/x86/kernel/ftrace.c | 14 --------------
 include/linux/ftrace.h   |  5 +++++
 kernel/trace/fgraph.c    |  9 +++++++++
 kernel/trace/ftrace.c    | 19 +++++++------------
 4 files changed, 21 insertions(+), 26 deletions(-)