[bpf,1/3] ftrace: Fix function_graph tracer interaction with BPF trampoline
diff mbox series

Message ID 20191209000114.1876138-2-ast@kernel.org
State Accepted
Delegated to: BPF Maintainers
Headers show
Series
  • bpf: Make BPF trampoline friendly to ftrace
Related show

Commit Message

Alexei Starovoitov Dec. 9, 2019, 12:01 a.m. UTC
Depending on type of BPF programs served by BPF trampoline it can call original
function. In such case the trampoline will skip one stack frame while
returning. That will confuse function_graph tracer and will cause crashes with
bad RIP. Teach graph tracer to skip functions that have BPF trampoline attached.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 arch/x86/kernel/ftrace.c | 14 --------------
 include/linux/ftrace.h   |  5 +++++
 kernel/trace/fgraph.c    |  9 +++++++++
 kernel/trace/ftrace.c    | 19 +++++++------------
 4 files changed, 21 insertions(+), 26 deletions(-)

Comments

Alexei Starovoitov Dec. 10, 2019, 4:19 p.m. UTC | #1
On Sun, Dec 8, 2019 at 4:03 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> Depending on type of BPF programs served by BPF trampoline it can call original
> function. In such case the trampoline will skip one stack frame while
> returning. That will confuse function_graph tracer and will cause crashes with
> bad RIP. Teach graph tracer to skip functions that have BPF trampoline attached.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Steven, please take a look.
Steven Rostedt Dec. 10, 2019, 4:30 p.m. UTC | #2
On Tue, 10 Dec 2019 08:19:42 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Sun, Dec 8, 2019 at 4:03 PM Alexei Starovoitov <ast@kernel.org> wrote:
> >
> > Depending on type of BPF programs served by BPF trampoline it can call original
> > function. In such case the trampoline will skip one stack frame while
> > returning. That will confuse function_graph tracer and will cause crashes with
> > bad RIP. Teach graph tracer to skip functions that have BPF trampoline attached.
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>  
> 
> Steven, please take a look.

I'll try to get to it today or tomorrow. I have some other work to get
done that my job requires I do ;-)

-- Steve
Steven Rostedt Dec. 10, 2019, 11:35 p.m. UTC | #3
On Sun, 8 Dec 2019 16:01:12 -0800
Alexei Starovoitov <ast@kernel.org> wrote:

>  #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 67e0c462b059..a2659735db73 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -101,6 +101,15 @@ int function_graph_enter(unsigned long ret, unsigned long func,
>  {
>  	struct ftrace_graph_ent trace;
>  
> +	/*
> +	 * Skip graph tracing if the return location is served by direct trampoline,
> +	 * since call sequence and return addresses is unpredicatable anymore.
> +	 * Ex: BPF trampoline may call original function and may skip frame
> +	 * depending on type of BPF programs attached.
> +	 */
> +	if (ftrace_direct_func_count &&
> +	    ftrace_find_rec_direct(ret - MCOUNT_INSN_SIZE))

My only worry is that this may not work for all archs that implement
it. But I figure we can cross that bridge when we get to it.

> +		return -EBUSY;
>  	trace.func = func;
>  	trace.depth = ++current->curr_ret_depth;
>  

I added this patch to my queue and it's about 70% done going through my
test suite (takes around 10 - 13 hours).

As I'm about to send a pull request to Linus tomorrow, I could include
this patch (as it will be fully tested), and then you could apply the
other two when it hits Linus's tree.

Would that work for you?

-- Steve
Alexei Starovoitov Dec. 10, 2019, 11:49 p.m. UTC | #4
On Tue, Dec 10, 2019 at 06:35:19PM -0500, Steven Rostedt wrote:
> On Sun, 8 Dec 2019 16:01:12 -0800
> Alexei Starovoitov <ast@kernel.org> wrote:
> 
> >  #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> > index 67e0c462b059..a2659735db73 100644
> > --- a/kernel/trace/fgraph.c
> > +++ b/kernel/trace/fgraph.c
> > @@ -101,6 +101,15 @@ int function_graph_enter(unsigned long ret, unsigned long func,
> >  {
> >  	struct ftrace_graph_ent trace;
> >  
> > +	/*
> > +	 * Skip graph tracing if the return location is served by direct trampoline,
> > +	 * since call sequence and return addresses is unpredicatable anymore.
> > +	 * Ex: BPF trampoline may call original function and may skip frame
> > +	 * depending on type of BPF programs attached.
> > +	 */
> > +	if (ftrace_direct_func_count &&
> > +	    ftrace_find_rec_direct(ret - MCOUNT_INSN_SIZE))
> 
> My only worry is that this may not work for all archs that implement
> it. But I figure we can cross that bridge when we get to it.

Right. Since bpf trampoline is going to be the only user in short term
it's not an issue, since trampoline is x86-64 only so far.

> > +		return -EBUSY;
> >  	trace.func = func;
> >  	trace.depth = ++current->curr_ret_depth;
> >  
> 
> I added this patch to my queue and it's about 70% done going through my
> test suite (takes around 10 - 13 hours).
> 
> As I'm about to send a pull request to Linus tomorrow, I could include
> this patch (as it will be fully tested), and then you could apply the
> other two when it hits Linus's tree.
> 
> Would that work for you?

Awesome. Much appreciate additional testing. I can certainly wait another day.
I was hoping to get patch 2 all the way to Linus's tree before rc2 to make sure
register_ftrace_direct() API is used for real in this kernel cycle. When
everything will land I'll backport to our production kernel and then the actual
stress testing begins :)

Patch
diff mbox series

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 060a361d9d11..024c3053dbba 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -1042,20 +1042,6 @@  void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
 	if (unlikely(atomic_read(&current->tracing_graph_pause)))
 		return;
 
-	/*
-	 * If the return location is actually pointing directly to
-	 * the start of a direct trampoline (if we trace the trampoline
-	 * it will still be offset by MCOUNT_INSN_SIZE), then the
-	 * return address is actually off by one word, and we
-	 * need to adjust for that.
-	 */
-	if (ftrace_direct_func_count) {
-		if (ftrace_find_direct_func(self_addr + MCOUNT_INSN_SIZE)) {
-			self_addr = *parent;
-			parent++;
-		}
-	}
-
 	/*
 	 * Protect against fault, even if it shouldn't
 	 * happen. This tool is too much intrusive to
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 7247d35c3d16..db95244a62d4 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -264,6 +264,7 @@  int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
 				struct dyn_ftrace *rec,
 				unsigned long old_addr,
 				unsigned long new_addr);
+unsigned long ftrace_find_rec_direct(unsigned long ip);
 #else
 # define ftrace_direct_func_count 0
 static inline int register_ftrace_direct(unsigned long ip, unsigned long addr)
@@ -290,6 +291,10 @@  static inline int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
 {
 	return -ENODEV;
 }
+static inline unsigned long ftrace_find_rec_direct(unsigned long ip)
+{
+	return 0;
+}
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 
 #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 67e0c462b059..a2659735db73 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -101,6 +101,15 @@  int function_graph_enter(unsigned long ret, unsigned long func,
 {
 	struct ftrace_graph_ent trace;
 
+	/*
+	 * Skip graph tracing if the return location is served by direct trampoline,
+	 * since call sequence and return addresses is unpredicatable anymore.
+	 * Ex: BPF trampoline may call original function and may skip frame
+	 * depending on type of BPF programs attached.
+	 */
+	if (ftrace_direct_func_count &&
+	    ftrace_find_rec_direct(ret - MCOUNT_INSN_SIZE))
+		return -EBUSY;
 	trace.func = func;
 	trace.depth = ++current->curr_ret_depth;
 
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 74439ab5c2b6..ac99a3500076 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2364,7 +2364,7 @@  int ftrace_direct_func_count;
  * Search the direct_functions hash to see if the given instruction pointer
  * has a direct caller attached to it.
  */
-static unsigned long find_rec_direct(unsigned long ip)
+unsigned long ftrace_find_rec_direct(unsigned long ip)
 {
 	struct ftrace_func_entry *entry;
 
@@ -2380,7 +2380,7 @@  static void call_direct_funcs(unsigned long ip, unsigned long pip,
 {
 	unsigned long addr;
 
-	addr = find_rec_direct(ip);
+	addr = ftrace_find_rec_direct(ip);
 	if (!addr)
 		return;
 
@@ -2393,11 +2393,6 @@  struct ftrace_ops direct_ops = {
 			  | FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS
 			  | FTRACE_OPS_FL_PERMANENT,
 };
-#else
-static inline unsigned long find_rec_direct(unsigned long ip)
-{
-	return 0;
-}
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 
 /**
@@ -2417,7 +2412,7 @@  unsigned long ftrace_get_addr_new(struct dyn_ftrace *rec)
 
 	if ((rec->flags & FTRACE_FL_DIRECT) &&
 	    (ftrace_rec_count(rec) == 1)) {
-		addr = find_rec_direct(rec->ip);
+		addr = ftrace_find_rec_direct(rec->ip);
 		if (addr)
 			return addr;
 		WARN_ON_ONCE(1);
@@ -2458,7 +2453,7 @@  unsigned long ftrace_get_addr_curr(struct dyn_ftrace *rec)
 
 	/* Direct calls take precedence over trampolines */
 	if (rec->flags & FTRACE_FL_DIRECT_EN) {
-		addr = find_rec_direct(rec->ip);
+		addr = ftrace_find_rec_direct(rec->ip);
 		if (addr)
 			return addr;
 		WARN_ON_ONCE(1);
@@ -3604,7 +3599,7 @@  static int t_show(struct seq_file *m, void *v)
 		if (rec->flags & FTRACE_FL_DIRECT) {
 			unsigned long direct;
 
-			direct = find_rec_direct(rec->ip);
+			direct = ftrace_find_rec_direct(rec->ip);
 			if (direct)
 				seq_printf(m, "\n\tdirect-->%pS", (void *)direct);
 		}
@@ -5008,7 +5003,7 @@  int register_ftrace_direct(unsigned long ip, unsigned long addr)
 	mutex_lock(&direct_mutex);
 
 	/* See if there's a direct function at @ip already */
-	if (find_rec_direct(ip))
+	if (ftrace_find_rec_direct(ip))
 		goto out_unlock;
 
 	ret = -ENODEV;
@@ -5027,7 +5022,7 @@  int register_ftrace_direct(unsigned long ip, unsigned long addr)
 	if (ip != rec->ip) {
 		ip = rec->ip;
 		/* Need to check this ip for a direct. */
-		if (find_rec_direct(ip))
+		if (ftrace_find_rec_direct(ip))
 			goto out_unlock;
 	}