Message ID | 20091013152235.188059d2@marrow.netinsight.se |
---|---|
State | RFC |
Headers | show |
On Tue, 13 Oct 2009, Simon Kagstrom wrote: > + > +struct dump_device { I know I used that name myself in the example code I sent out, but thinking about it more, I hate the name. Most people think a "dump" is something much more than just the kernel messages, so "dump_device" ends up being misleading. And the "device" part is kind of pointless and redundant (it need not be a real device). So I suspect it would be better to name it by what it does, and make it clear what that is. Maybe something like "struct kmsg_dumper" or similar. That is pretty unambiguous, and nobody is going to get confused about what the semantics are. > + void (*oops)(struct dump_device *, const char *, unsigned long, > + const char *, unsigned long); > + void (*panic)(struct dump_device *, const char *, unsigned long, > + const char *, unsigned long); I don't much like this. Why separate 'oops' and 'panic' functions, especially since we migth have many more causes to do a kmsg dump in the future (ie 'shutdown', 'sysrq', 'suspend' etc etc)? So separating them out as two different functions is just wrong. Make it one function, and then perhaps you can add a enum kmsg_dump_reason { kmsg_dump_panic, kmsg_dump_oops, .. }; and pass it as an argument. > + int (*setup)(struct dump_device *); Why? There seems to be no reason for this. Who ever registers the dumper should just do the setup call itself. Why would we have a callback that just gets called immediately, rather than have the registration code just do the call itself? > +int register_dumpdevice(struct dump_device *dump, void *priv) > +{ > + /* We need at least one of these set */ > + if (!dump->oops && !dump->panic) > + return -EINVAL; > + if (dump->setup && dump->setup(dump) != 0) > + return -EINVAL; So the above two tests should be pointless. > + dump->priv = priv; > + > + INIT_LIST_HEAD(&dump->list); Don't do "INIT_LIST_HEAD()" here. It's pointless as far as I can tell (the list_add() will initialize it), but even in general we should basically have basic initialization done by callers if needed. And judging by historical problems in areas like that, we should protect against people registering the same dumper twice. One way to do that would be to perhaps _require_ that the caller has initialized it, and then do a if (!list_empty(&dump->list)) return -EBUSY; (but I could also see using just a "registered" flag) > + write_lock(&dump_list_lock); This looks dubious. Dumping can obviously happen from interrupts, so _if_ you were to protect against dumpers, you'd need to use an interrupt-safe lock. Of course, you do not actually take the lock at dump time (which may be intentional, and that is not necessarily wrong - taking locks at oops time is generally not a good thing to do, and it may be entirely reasonable to say "we take the risk of not locking properly in order to _maybe_ work even if the lock is scrogged"). But if you don't take the lock at dump time (or, perhaps preferably, if you make the dump-time lock be a "try_lock()" - maybe the oops is due to dump list corruption, and if the dump_list_lock is held thew oopser should simply not dump!), then you should probably use a spinlock rather than an rwlock. > + list_for_each_entry(dump, &dump_list, list) { > + if (panic && dump->panic) > + dump->panic(dump, s1, l1, s2, l2); > + else if (!panic && dump->oops) > + dump->oops(dump, s1, l1, s2, l2); > + } So this would just become list_for_each_entry(dump, &dump_list, list) dump->fn(dump, s1, l1, s2, l2, reason); or something. Linus
Just stumbled across this patch. On Tue, 13 October 2009 15:22:35 +0200, Simon Kagstrom wrote: > +void dump_kmsg(int panic) > +{ > + unsigned long len = ACCESS_ONCE(log_end); > + struct dump_device *dump; > + const char *s1, *s2; > + unsigned long l1, l2; > + > + s1 = ""; > + l1 = 0; > + s2 = log_buf; > + l2 = len; > + > + /* Have we rotated around the circular buffer? */ > + if (len > log_buf_len) { I believe this bit is wrong. log_end is an unsigned int, so it can wrap relatively quickly. If log_end just wrapped to 0 and dump_kmsg is called, the amount of printk buffer stored appears to be 0 as well. To avoid this case one could either declare log_end and len as u64, causing extra computational costs. Or one could just use the conditional code below unconditionally. That could write random or zeroed printk buffer directly after bootup, but would never miss information. > + unsigned long pos = (len & LOG_BUF_MASK); > + > + s1 = log_buf + pos; > + l1 = log_buf_len - pos; > + > + s2 = log_buf; > + l2 = pos; > + } > + > + list_for_each_entry(dump, &dump_list, list) { > + if (panic && dump->panic) > + dump->panic(dump, s1, l1, s2, l2); > + else if (!panic && dump->oops) > + dump->oops(dump, s1, l1, s2, l2); > + } > +} > -- > 1.6.0.4 Jörn
On Thu, 2009-11-26 at 10:36 +0100, Jörn Engel wrote: > Just stumbled across this patch. > > On Tue, 13 October 2009 15:22:35 +0200, Simon Kagstrom wrote: > > +void dump_kmsg(int panic) > > +{ > > + unsigned long len = ACCESS_ONCE(log_end); > > + struct dump_device *dump; > > + const char *s1, *s2; > > + unsigned long l1, l2; > > + > > + s1 = ""; > > + l1 = 0; > > + s2 = log_buf; > > + l2 = len; > > + > > + /* Have we rotated around the circular buffer? */ > > + if (len > log_buf_len) { > > I believe this bit is wrong. log_end is an unsigned int, so it can > wrap relatively quickly. If log_end just wrapped to 0 and dump_kmsg is > called, the amount of printk buffer stored appears to be 0 as well. To me it looks like 'log_end' is not supposed to wrap. What makes you think it can? In which cases it can?
On Mon, 30 November 2009 09:27:51 +0200, Artem Bityutskiy wrote: > > To me it looks like 'log_end' is not supposed to wrap. What makes you > think it can? In which cases it can? It is a 32bit variable. Would do you expect happens once you reach 0xffffffff and add 1? Jörn
On Thu, 2009-11-26 at 10:36 +0100, Jörn Engel wrote: > Just stumbled across this patch. > > On Tue, 13 October 2009 15:22:35 +0200, Simon Kagstrom wrote: > > +void dump_kmsg(int panic) > > +{ > > + unsigned long len = ACCESS_ONCE(log_end); > > + struct dump_device *dump; > > + const char *s1, *s2; > > + unsigned long l1, l2; > > + > > + s1 = ""; > > + l1 = 0; > > + s2 = log_buf; > > + l2 = len; > > + > > + /* Have we rotated around the circular buffer? */ > > + if (len > log_buf_len) { > > I believe this bit is wrong. log_end is an unsigned int, so it can > wrap relatively quickly. If log_end just wrapped to 0 and dump_kmsg is > called, the amount of printk buffer stored appears to be 0 as well. Simon, if you want your patches to be merged to 2.6.33, we should address this issue ASAP, otherwise it'll be too late.
On Mon, 30 Nov 2009 11:09:47 +0200 Artem Bityutskiy <dedekind1@gmail.com> wrote: > > I believe this bit is wrong. log_end is an unsigned int, so it can > > wrap relatively quickly. If log_end just wrapped to 0 and dump_kmsg is > > called, the amount of printk buffer stored appears to be 0 as well. > > Simon, if you want your patches to be merged to 2.6.33, we should > address this issue ASAP, otherwise it'll be too late. I'm testing your patch, I'll get back when I have verified that it works. // Simon
diff --git a/include/linux/dump_device.h b/include/linux/dump_device.h new file mode 100644 index 0000000..8c3f378 --- /dev/null +++ b/include/linux/dump_device.h @@ -0,0 +1,33 @@ +/* + * linux/include/dump_device.h + * + * Copyright (C) 2009 Net Insight + * + * Author: Simon Kagstrom <simon.kagstrom@netinsight.net> + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file COPYING in the main directory of this archive + * for more details. + */ +#ifndef _LINUX_DUMP_DEVICE_H +#define _LINUX_DUMP_DEVICE_H + +#include <linux/list.h> + +struct dump_device { + void (*oops)(struct dump_device *, const char *, unsigned long, + const char *, unsigned long); + void (*panic)(struct dump_device *, const char *, unsigned long, + const char *, unsigned long); + int (*setup)(struct dump_device *); + void *priv; + struct list_head list; +}; + +void dump_kmsg(int panic); + +int register_dumpdevice(struct dump_device *dump, void *priv); + +void unregister_dumpdevice(struct dump_device *dump); + +#endif /* _LINUX_DUMP_DEVICE_H */ diff --git a/kernel/panic.c b/kernel/panic.c index c0b33b8..e7dbf2b 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -8,6 +8,7 @@ * This function is used through-out the kernel (including mm and fs) * to indicate a major problem. */ +#include <linux/dump_device.h> #include <linux/debug_locks.h> #include <linux/interrupt.h> #include <linux/kallsyms.h> @@ -76,6 +77,7 @@ NORET_TYPE void panic(const char * fmt, ...) dump_stack(); #endif + dump_kmsg(1); /* * If we have crashed and we have a crash kernel loaded let it handle * everything else. @@ -341,6 +343,7 @@ void oops_exit(void) { do_oops_enter_exit(); print_oops_end_marker(); + dump_kmsg(0); } #ifdef WANT_WARN_ON_SLOWPATH diff --git a/kernel/printk.c b/kernel/printk.c index f38b07f..5e1fe73 100644 --- a/kernel/printk.c +++ b/kernel/printk.c @@ -33,6 +33,7 @@ #include <linux/bootmem.h> #include <linux/syscalls.h> #include <linux/kexec.h> +#include <linux/dump_device.h> #include <asm/uaccess.h> @@ -1405,3 +1406,85 @@ bool printk_timed_ratelimit(unsigned long *caller_jiffies, } EXPORT_SYMBOL(printk_timed_ratelimit); #endif + +static LIST_HEAD(dump_list); +static DEFINE_RWLOCK(dump_list_lock); + +/** + * register_dump_device - register a dumpdevice. + * @dump: pointer to the dump structure + * @priv: private data for the structure + * + * Adds a dump device to the system. The oops and panic callbacks + * in the structure will be called when the kernel oopses and panics + * respectively. At least one of these must be set. Returns zero on + * success and -EINVAL otherwise. + */ +int register_dumpdevice(struct dump_device *dump, void *priv) +{ + /* We need at least one of these set */ + if (!dump->oops && !dump->panic) + return -EINVAL; + if (dump->setup && dump->setup(dump) != 0) + return -EINVAL; + dump->priv = priv; + + INIT_LIST_HEAD(&dump->list); + write_lock(&dump_list_lock); + list_add(&dump->list, &dump_list); + write_unlock(&dump_list_lock); + return 0; +} +EXPORT_SYMBOL(register_dumpdevice); + +/** + * unregister_dump_device - unregister a dumpdevice. + * @dump: pointer to the dump structure + * + * Removes a dump device from the system. + */ +void unregister_dumpdevice(struct dump_device *dump) +{ + write_lock(&dump_list_lock); + list_del(&dump->list); + write_unlock(&dump_list_lock); +} +EXPORT_SYMBOL(unregister_dumpdevice); + +/** + * dump_kmsg - dump kernel log to dump devices. + * @panic: if this is a panic (1) or an oops (0) + * + * Iterate through each of the dump devices and call the oops/panic + * callbacks with the log buffer. + */ +void dump_kmsg(int panic) +{ + unsigned long len = ACCESS_ONCE(log_end); + struct dump_device *dump; + const char *s1, *s2; + unsigned long l1, l2; + + s1 = ""; + l1 = 0; + s2 = log_buf; + l2 = len; + + /* Have we rotated around the circular buffer? */ + if (len > log_buf_len) { + unsigned long pos = (len & LOG_BUF_MASK); + + s1 = log_buf + pos; + l1 = log_buf_len - pos; + + s2 = log_buf; + l2 = pos; + } + + list_for_each_entry(dump, &dump_list, list) { + if (panic && dump->panic) + dump->panic(dump, s1, l1, s2, l2); + else if (!panic && dump->oops) + dump->oops(dump, s1, l1, s2, l2); + } +}
The core functionality is implemented as per Linus suggestion from http://lists.infradead.org/pipermail/linux-mtd/2009-October/027620.html a struct dump_device type has been added which contains callbacks to dump the kernel log buffers on crashes. The dump_kmsg function gets called from oops_exit() and panic() and invokes these callbacks. Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net> --- include/linux/dump_device.h | 33 +++++++++++++++++ kernel/panic.c | 3 ++ kernel/printk.c | 83 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 119 insertions(+), 0 deletions(-) create mode 100644 include/linux/dump_device.h