Message ID | 05e021f757625cbbb006fad41380323dbe4e3b43.1562249521.git.naveen.n.rao@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Not Applicable |
Series | ftrace: two fixes with func_probes handling |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch next (f531d5e8f55b3767217b5d1be0ce1f1acd10167c) |
snowpatch_ozlabs/checkpatch | warning | total: 0 errors, 1 warnings, 0 checks, 10 lines checked |
On Thu, 4 Jul 2019 20:04:41 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:

> LTP testsuite on powerpc results in the below crash:
>
>   Unable to handle kernel paging request for data at address 0x00000000
>   Faulting instruction address: 0xc00000000029d800
>   Oops: Kernel access of bad area, sig: 11 [#1]
>   LE SMP NR_CPUS=2048 NUMA PowerNV
>   ...
>   CPU: 68 PID: 96584 Comm: cat Kdump: loaded Tainted: G W
>   NIP: c00000000029d800 LR: c00000000029dac4 CTR: c0000000001e6ad0
>   REGS: c0002017fae8ba10 TRAP: 0300 Tainted: G W
>   MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28022422 XER: 20040000
>   CFAR: c00000000029d90c DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 0
>   ...
>   NIP [c00000000029d800] t_probe_next+0x60/0x180
>   LR [c00000000029dac4] t_mod_start+0x1a4/0x1f0
>   Call Trace:
>   [c0002017fae8bc90] [c000000000cdbc40] _cond_resched+0x10/0xb0 (unreliable)
>   [c0002017fae8bce0] [c0000000002a15b0] t_start+0xf0/0x1c0
>   [c0002017fae8bd30] [c0000000004ec2b4] seq_read+0x184/0x640
>   [c0002017fae8bdd0] [c0000000004a57bc] sys_read+0x10c/0x300
>   [c0002017fae8be30] [c00000000000b388] system_call+0x5c/0x70
>
> The test (ftrace_set_ftrace_filter.sh) is part of ftrace stress tests
> and the crash happens when the test does 'cat
> $TRACING_PATH/set_ftrace_filter'.
>
> The address points to the second line below, in t_probe_next(), where
> filter_hash is dereferenced:
>   hash = iter->probe->ops.func_hash->filter_hash;
>   size = 1 << hash->size_bits;
>
> This happens due to a race with register_ftrace_function_probe(). A new
> ftrace_func_probe is created and added into the func_probes list in
> trace_array under ftrace_lock. However, before initializing the filter,
> we drop ftrace_lock, and re-acquire it after acquiring regex_lock. If
> another process is trying to read set_ftrace_filter, it will be able to
> acquire ftrace_lock during this window and it will end up seeing a NULL
> filter_hash.
>
> Fix this by just checking for a NULL filter_hash in t_probe_next(). If
> the filter_hash is NULL, then this probe is just being added and we can
> simply return from here.

Hmm, this is very subtle. I'll take a deeper look at this to see if we
can keep the race from happening.

Thanks!

-- Steve

>
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
>  kernel/trace/ftrace.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 7b037295a1f1..0791eafb693d 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -3093,6 +3093,10 @@ t_probe_next(struct seq_file *m, loff_t *pos)
>  	hnd = &iter->probe_entry->hlist;
>
>  	hash = iter->probe->ops.func_hash->filter_hash;
> +
> +	if (!hash)
> +		return NULL;
> +
>  	size = 1 << hash->size_bits;
>
>   retry:
On Thu, 4 Jul 2019 20:04:41 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:

>  kernel/trace/ftrace.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 7b037295a1f1..0791eafb693d 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -3093,6 +3093,10 @@ t_probe_next(struct seq_file *m, loff_t *pos)
>  	hnd = &iter->probe_entry->hlist;
>
>  	hash = iter->probe->ops.func_hash->filter_hash;
> +
> +	if (!hash)
> +		return NULL;
> +
>  	size = 1 << hash->size_bits;
>
>   retry:

OK, I added this, but I'm also adding this on top:

-- Steve

From 372e0d01da71c84dcecf7028598a33813b0d5256 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Date: Fri, 30 Aug 2019 16:30:01 -0400
Subject: [PATCH] ftrace: Check for empty hash and comment the race with registering probes

The race between adding a function probe and reading the probes that
exist is very subtle. It needs a comment. The issue can also happen if
the probe has the EMPTY_HASH as its func_hash.

Cc: stable@vger.kernel.org
Fixes: 7b60f3d876156 ("ftrace: Dynamically create the probe ftrace_ops for the trace_array")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 80beed2cf0da..6200a6fe10e3 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3096,7 +3096,11 @@ t_probe_next(struct seq_file *m, loff_t *pos)

 	hash = iter->probe->ops.func_hash->filter_hash;

-	if (!hash)
+	/*
+	 * A probe being registered may temporarily have an empty hash
+	 * and it's at the end of the func_probes list.
+	 */
+	if (!hash || hash == EMPTY_HASH)
 		return NULL;

 	size = 1 << hash->size_bits;
@@ -4324,6 +4328,10 @@ register_ftrace_function_probe(char *glob, struct trace_array *tr,

 	mutex_unlock(&ftrace_lock);

+	/*
+	 * Note, there's a small window here that the func_hash->filter_hash
+	 * may be NULL or empty. Need to be careful when reading the loop.
+	 */
 	mutex_lock(&probe->ops.func_hash->regex_lock);

 	orig_hash = &probe->ops.func_hash->filter_hash;
Steven Rostedt wrote:
> On Thu, 4 Jul 2019 20:04:41 +0530
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
>
>
>>  kernel/trace/ftrace.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
>> index 7b037295a1f1..0791eafb693d 100644
>> --- a/kernel/trace/ftrace.c
>> +++ b/kernel/trace/ftrace.c
>> @@ -3093,6 +3093,10 @@ t_probe_next(struct seq_file *m, loff_t *pos)
>>  	hnd = &iter->probe_entry->hlist;
>>
>>  	hash = iter->probe->ops.func_hash->filter_hash;
>> +
>> +	if (!hash)
>> +		return NULL;
>> +
>>  	size = 1 << hash->size_bits;
>>
>>   retry:
>
> OK, I added this, but I'm also adding this on top:

Thanks, the additional comments do make this much clearer.

Regards,
Naveen
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 7b037295a1f1..0791eafb693d 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3093,6 +3093,10 @@ t_probe_next(struct seq_file *m, loff_t *pos)
 	hnd = &iter->probe_entry->hlist;

 	hash = iter->probe->ops.func_hash->filter_hash;
+
+	if (!hash)
+		return NULL;
+
 	size = 1 << hash->size_bits;

  retry:
LTP testsuite on powerpc results in the below crash:

  Unable to handle kernel paging request for data at address 0x00000000
  Faulting instruction address: 0xc00000000029d800
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE SMP NR_CPUS=2048 NUMA PowerNV
  ...
  CPU: 68 PID: 96584 Comm: cat Kdump: loaded Tainted: G W
  NIP: c00000000029d800 LR: c00000000029dac4 CTR: c0000000001e6ad0
  REGS: c0002017fae8ba10 TRAP: 0300 Tainted: G W
  MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28022422 XER: 20040000
  CFAR: c00000000029d90c DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 0
  ...
  NIP [c00000000029d800] t_probe_next+0x60/0x180
  LR [c00000000029dac4] t_mod_start+0x1a4/0x1f0
  Call Trace:
  [c0002017fae8bc90] [c000000000cdbc40] _cond_resched+0x10/0xb0 (unreliable)
  [c0002017fae8bce0] [c0000000002a15b0] t_start+0xf0/0x1c0
  [c0002017fae8bd30] [c0000000004ec2b4] seq_read+0x184/0x640
  [c0002017fae8bdd0] [c0000000004a57bc] sys_read+0x10c/0x300
  [c0002017fae8be30] [c00000000000b388] system_call+0x5c/0x70

The test (ftrace_set_ftrace_filter.sh) is part of ftrace stress tests
and the crash happens when the test does 'cat
$TRACING_PATH/set_ftrace_filter'.

The address points to the second line below, in t_probe_next(), where
filter_hash is dereferenced:
  hash = iter->probe->ops.func_hash->filter_hash;
  size = 1 << hash->size_bits;

This happens due to a race with register_ftrace_function_probe(). A new
ftrace_func_probe is created and added into the func_probes list in
trace_array under ftrace_lock. However, before initializing the filter,
we drop ftrace_lock, and re-acquire it after acquiring regex_lock. If
another process is trying to read set_ftrace_filter, it will be able to
acquire ftrace_lock during this window and it will end up seeing a NULL
filter_hash.

Fix this by just checking for a NULL filter_hash in t_probe_next(). If
the filter_hash is NULL, then this probe is just being added and we can
simply return from here.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 kernel/trace/ftrace.c | 4 ++++
 1 file changed, 4 insertions(+)