Message ID | 1440656819-25622-1-git-send-email-ast@plumgrid.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
From: Alexei Starovoitov <ast@plumgrid.com> Date: Wed, 26 Aug 2015 23:26:59 -0700 > +/* similar to strncpy_from_user() but with extra checks */ > +static void probe_read_string(char *buf, int size, long unsafe_ptr) > +{ > + char dst[4]; > + int i = 0; > + > + size--; > + for (;;) { > + if (probe_kernel_read(dst, (void *) unsafe_ptr, 4)) > + break; I don't think this does the right thing when the string is not a multiple of 3 and ends at the last byte of a page that ends a valid region of kernel memory. Seeing this kind of error makes me skeptical to the overall value of optimizing this :-/ -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 8/27/15 3:34 PM, David Miller wrote: > From: Alexei Starovoitov <ast@plumgrid.com> > Date: Wed, 26 Aug 2015 23:26:59 -0700 > >> +/* similar to strncpy_from_user() but with extra checks */ >> +static void probe_read_string(char *buf, int size, long unsafe_ptr) >> +{ >> + char dst[4]; >> + int i = 0; >> + >> + size--; >> + for (;;) { >> + if (probe_kernel_read(dst, (void *) unsafe_ptr, 4)) >> + break; > > I don't think this does the right thing when the string is not a multiple > of 3 and ends at the last byte of a page that ends a valid region of > kernel memory. > > Seeing this kind of error makes me skeptical to the overall value of > optimizing this :-/ I've considered the case when first two bytes are valid, but the other two are in a different page. In such case the probe_read_string() will trim the string and won't be printing these two valid bytes. I think that's very rare, so I'm picking higher performance for common case. The strings over > 64 bytes also will be trimmed to 64. It's a debugging facility, so I felt that's ok. Fair or you still think it should be per byte copy? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Alexei Starovoitov <ast@plumgrid.com> Date: Thu, 27 Aug 2015 16:06:14 -0700 > Fair or you still think it should be per byte copy? I'm terribly surprised we don't have an equivalent of strncpy() for unsafe kernel pointers. You probably won't be the last person to want this, and it's silly to optimize it in one place and then wait for cut&paste into the next guy. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 27 Aug 2015 16:20:39 -0700 (PDT) David Miller <davem@davemloft.net> wrote: > From: Alexei Starovoitov <ast@plumgrid.com> > Date: Thu, 27 Aug 2015 16:06:14 -0700 > > > Fair or you still think it should be per byte copy? > > I'm terribly surprised we don't have an equivalent of strncpy() > for unsafe kernel pointers. > > You probably won't be the last person to want this, and it's silly > to optimize it in one place and then wait for cut&paste into the > next guy. If it doesn't exist. Perhaps its time to create it. -- Steve -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 8/27/15 4:43 PM, Steven Rostedt wrote: > On Thu, 27 Aug 2015 16:20:39 -0700 (PDT) > David Miller <davem@davemloft.net> wrote: > >> From: Alexei Starovoitov <ast@plumgrid.com> >> Date: Thu, 27 Aug 2015 16:06:14 -0700 >> >>> Fair or you still think it should be per byte copy? >> >> I'm terribly surprised we don't have an equivalent of strncpy() >> for unsafe kernel pointers. >> >> You probably won't be the last person to want this, and it's silly >> to optimize it in one place and then wait for cut&paste into the >> next guy. > > If it doesn't exist. Perhaps its time to create it. all makes sense. Working on generalizing FETCH_FUNC_NAME from trace_kprobe.c. Seems to fit quite well. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index ef9936df1b04..60d8f95258ed 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -79,13 +79,51 @@ static const struct bpf_func_proto bpf_probe_read_proto = { .arg3_type = ARG_ANYTHING, }; +static bool is_valid_char(char c) +{ + return (isprint(c) || isspace(c)) && isascii(c); +} + +/* similar to strncpy_from_user() but with extra checks */ +static void probe_read_string(char *buf, int size, long unsafe_ptr) +{ + char dst[4]; + int i = 0; + + size--; + for (;;) { + if (probe_kernel_read(dst, (void *) unsafe_ptr, 4)) + break; + + unsafe_ptr += 4; + + if (dst[0] == 0 || !is_valid_char(dst[0]) || i >= size) + break; + buf[i++] = dst[0]; + + if (dst[1] == 0 || !is_valid_char(dst[1]) || i >= size) + break; + buf[i++] = dst[1]; + + if (dst[2] == 0 || !is_valid_char(dst[2]) || i >= size) + break; + buf[i++] = dst[2]; + + if (dst[3] == 0 || !is_valid_char(dst[3]) || i >= size) + break; + buf[i++] = dst[3]; + } + buf[i] = 0; +} + /* * limited trace_printk() - * only %d %u %x %ld %lu %lx %lld %llu %llx %p conversion specifiers allowed + * only %d %u %x %ld %lu %lx %lld %llu %llx %p %s conversion specifiers allowed */ static u64 bpf_trace_printk(u64 r1, u64 fmt_size, u64 r3, u64 r4, u64 r5) { char *fmt = (char *) (long) r1; + char buf[64]; int mod[3] = {}; int fmt_cnt = 0; int i; @@ -100,7 +138,7 @@ static u64 bpf_trace_printk(u64 r1, u64 fmt_size, u64 r3, u64 r4, u64 r5) /* check format string for allowed specifiers */ for (i = 0; i < fmt_size; i++) { - if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i])) + if (!is_valid_char(fmt[i])) return -EINVAL; if (fmt[i] != '%') @@ -114,12 +152,28 @@ static u64 bpf_trace_printk(u64 r1, u64 fmt_size, u64 r3, u64 r4, u64 r5) if (fmt[i] == 'l') { mod[fmt_cnt]++; i++; - } else if (fmt[i] == 'p') { + } else if (fmt[i] == 'p' || fmt[i] == 's') { mod[fmt_cnt]++; i++; if (!isspace(fmt[i]) && !ispunct(fmt[i]) && fmt[i] != 0) return -EINVAL; fmt_cnt++; + if (fmt[i - 1] == 's') { + switch (fmt_cnt) { + case 1: + probe_read_string(buf, sizeof(buf), r3); + r3 = (long) buf; + break; + case 2: + probe_read_string(buf, sizeof(buf), r4); + r4 = (long) buf; + break; + case 3: + probe_read_string(buf, sizeof(buf), r5); + r5 = (long) buf; + break; + } + } continue; }
%s specifier makes bpf program and kernel debugging easier. To make sure that trace_printk won't crash the unsafe string is copied into stack and unsafe pointer is substituted. String is also sanitized for printable characters. The cost of swapping FS in probe_kernel_read is amortized over 4 chars to improve performance. Suggested-by: Brendan Gregg <brendan.d.gregg@gmail.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> --- The following C program: #include <linux/fs.h> int foo(struct pt_regs *ctx, struct filename *filename) { void *name = 0; bpf_probe_read(&name, sizeof(name), &filename->name); bpf_trace_printk("executed %s\\n", name); return 0; } when attached to kprobe do_execve() will produce output in /sys/kernel/debug/tracing/trace_pipe : make-13492 [002] d..1 3250.997277: : executed /bin/sh sh-13493 [004] d..1 3250.998716: : executed /usr/bin/gcc gcc-13494 [002] d..1 3250.999822: : executed /usr/lib/gcc/x86_64-linux-gnu/4.7/cc1 gcc-13495 [002] d..1 3251.006731: : executed /usr/bin/as gcc-13496 [002] d..1 3251.011831: : executed /usr/lib/gcc/x86_64-linux-gnu/4.7/collect2 collect2-13497 [000] d..1 3251.012941: : executed /usr/bin/ld kernel/trace/bpf_trace.c | 60 +++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 57 insertions(+), 3 deletions(-)