Patchwork [01/40] ftrace syscalls: don't add events for unmapped syscalls

login
register
mail settings
Submitter Ian Munsie
Date June 23, 2010, 10:02 a.m.
Message ID <1277287401-28571-2-git-send-email-imunsie@au1.ibm.com>
Download mbox | patch
Permalink /patch/56627/
State Not Applicable
Headers show

Comments

Ian Munsie - June 23, 2010, 10:02 a.m.
From: Ian Munsie <imunsie@au1.ibm.com>

FTRACE_SYSCALLS would create events for each and every system call, even
if it had failed to map the system call's name with it's number. This
resulted in a number of events being created that would not behave as
expected.

This could happen, for example, on architectures who's symbol names are
unusual and will not match the system call name. It could also happen
with system calls which were mapped to sys_ni_syscall.

This patch changes the default system call number in the metadata to -1.
If the system call name from the metadata is not successfully mapped to
a system call number during boot, than the event initialisation routine
will now return an error, preventing the event from being created.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
---
 include/linux/syscalls.h      |    2 ++
 kernel/trace/trace_syscalls.c |    8 ++++++++
 2 files changed, 10 insertions(+), 0 deletions(-)
Steven Rostedt - June 23, 2010, 3:02 p.m.
On Wed, 2010-06-23 at 20:02 +1000, Ian Munsie wrote:
> From: Ian Munsie <imunsie@au1.ibm.com>
> 
> FTRACE_SYSCALLS would create events for each and every system call, even
> if it had failed to map the system call's name with it's number. This
> resulted in a number of events being created that would not behave as
> expected.
> 
> This could happen, for example, on architectures who's symbol names are
> unusual and will not match the system call name. It could also happen
> with system calls which were mapped to sys_ni_syscall.
> 
> This patch changes the default system call number in the metadata to -1.
> If the system call name from the metadata is not successfully mapped to
> a system call number during boot, than the event initialisation routine
> will now return an error, preventing the event from being created.
> 
> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> ---
>  include/linux/syscalls.h      |    2 ++
>  kernel/trace/trace_syscalls.c |    8 ++++++++
>  2 files changed, 10 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 7f614ce..86f082b 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -159,6 +159,7 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>  	  __attribute__((section("__syscalls_metadata")))	\
>  	  __syscall_meta_##sname = {				\
>  		.name 		= "sys"#sname,			\
> +		.syscall_nr	= -1,	/* Filled in at boot */	\
>  		.nb_args 	= nb,				\
>  		.types		= types_##sname,		\
>  		.args		= args_##sname,			\
> @@ -176,6 +177,7 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>  	  __attribute__((section("__syscalls_metadata")))	\
>  	  __syscall_meta__##sname = {				\
>  		.name 		= "sys_"#sname,			\
> +		.syscall_nr	= -1,	/* Filled in at boot */	\
>  		.nb_args 	= 0,				\
>  		.enter_event	= &event_enter__##sname,	\
>  		.exit_event	= &event_exit__##sname,		\
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index 34e3580..82246ce 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -431,6 +431,14 @@ void unreg_event_syscall_exit(struct ftrace_event_call *call)
>  int init_syscall_trace(struct ftrace_event_call *call)
>  {
>  	int id;
> +	int num;
> +
> +	num = ((struct syscall_metadata *)call->data)->syscall_nr;
> +	if (num < 0 || num >= NR_syscalls) {
> +		pr_debug("syscall %s metadata not mapped, disabling ftrace event\n",
> +				((struct syscall_metadata *)call->data)->name);
> +		return -ENOSYS;
> +	}

Perhaps this should be:

	if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls)
		return -ENOSYS;

-- Steve

>  
>  	if (set_syscall_print_fmt(call) < 0)
>  		return -ENOMEM;
Ian Munsie - June 29, 2010, 1:18 a.m.
Excerpts from Steven Rostedt's message of Thu Jun 24 01:02:19 +1000 2010:
> > diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> > index 34e3580..82246ce 100644
> > --- a/kernel/trace/trace_syscalls.c
> > +++ b/kernel/trace/trace_syscalls.c
> > @@ -431,6 +431,14 @@ void unreg_event_syscall_exit(struct ftrace_event_call *call)
> >  int init_syscall_trace(struct ftrace_event_call *call)
> >  {
> >      int id;
> > +    int num;
> > +
> > +    num = ((struct syscall_metadata *)call->data)->syscall_nr;
> > +    if (num < 0 || num >= NR_syscalls) {
> > +        pr_debug("syscall %s metadata not mapped, disabling ftrace event\n",
> > +                ((struct syscall_metadata *)call->data)->name);
> > +        return -ENOSYS;
> > +    }
> 
> Perhaps this should be:
> 
>     if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls)
>         return -ENOSYS;
> 
> -- Steve


Hi Steven,

I don't think this should produce a warning - it signifies that a
syscall defined in the code has not successfully matched a syscall at
boot.

That may signify the matching failed, but it may just as likely signify
that the syscall isn't used on that arch (for instance, if an arch uses
an arch specific implementation in favour of a generic implementation,
or doesn't implement a particular syscall at all that is defined in the
generic code).

The problem case is actually the other way around - when a syscall
number from an arch's syscall table has not successfully mapped to some
syscall meta-data. Patch #37 prints out those cases with KERN_DEBUG so
booting a kernel with loglevel=8 can track them down.

This pr_debug may still be useful, for example to help locate unused
syscalls which can be removed if no arch uses them.

Cheers,
-Ian

Patch

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 7f614ce..86f082b 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -159,6 +159,7 @@  extern struct trace_event_functions exit_syscall_print_funcs;
 	  __attribute__((section("__syscalls_metadata")))	\
 	  __syscall_meta_##sname = {				\
 		.name 		= "sys"#sname,			\
+		.syscall_nr	= -1,	/* Filled in at boot */	\
 		.nb_args 	= nb,				\
 		.types		= types_##sname,		\
 		.args		= args_##sname,			\
@@ -176,6 +177,7 @@  extern struct trace_event_functions exit_syscall_print_funcs;
 	  __attribute__((section("__syscalls_metadata")))	\
 	  __syscall_meta__##sname = {				\
 		.name 		= "sys_"#sname,			\
+		.syscall_nr	= -1,	/* Filled in at boot */	\
 		.nb_args 	= 0,				\
 		.enter_event	= &event_enter__##sname,	\
 		.exit_event	= &event_exit__##sname,		\
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 34e3580..82246ce 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -431,6 +431,14 @@  void unreg_event_syscall_exit(struct ftrace_event_call *call)
 int init_syscall_trace(struct ftrace_event_call *call)
 {
 	int id;
+	int num;
+
+	num = ((struct syscall_metadata *)call->data)->syscall_nr;
+	if (num < 0 || num >= NR_syscalls) {
+		pr_debug("syscall %s metadata not mapped, disabling ftrace event\n",
+				((struct syscall_metadata *)call->data)->name);
+		return -ENOSYS;
+	}
 
 	if (set_syscall_print_fmt(call) < 0)
 		return -ENOMEM;