mbox series

[v2,0/5] allow ramoops to collect all kmesg_dump events

Message ID 20200505154510.93506-1-pasha.tatashin@soleen.com
Headers show
Series allow ramoops to collect all kmesg_dump events | expand

Message

Pasha Tatashin May 5, 2020, 3:45 p.m. UTC
pstore /mnt/console-ramoops-0 outputs only messages below the console
loglevel, and our console loglevel is set to 3 due to slowness of
serial console. Which means only errors and worse types of messages
are recorded. There is no way to have different log levels for
different consoles.

This patch series adds a new option to ramoops: max_reason that enables
it to collect kmdesg dumps for other reasons beside oops and panics.

How to quickly test:

virtme-run --mods=auto --kdir --mods=auto --kdir . \
	-a memmap=1G$8G -a ramoops.mem_address=0x200000000 \
	-a ramoops.mem_size=0x100000 -a ramoops.record_size=32768 \
	-a ramoops.max_reason=5 -a quiet --qemu-opts -m 8G
..
# reboot -f

After VM is back:

# mount -t pstore pstore /mnt
# head /mnt/dmesg-ramoops-0 
Restart#1 Part1
...

Changelog:

v1:
https://lore.kernel.org/lkml/20200502143555.543636-1-pasha.tatashin@soleen.com

v2:
Addressed comments from Kees Cook, Steven Rostedt, and Sergey Senozhatsky
 - Replaced dump_all with max_reason
 - removed duplicated enum value
 - moved always_kmsg_dump logic back to kmsg_dump().

Pavel Tatashin (5):
  printk: honor the max_reason field in kmsg_dumper
  pstore/platform: pass max_reason to kmesg dump
  pstore/ram: in ramoops_platform_data convert dump_oops to max_reason
  pstore/ram: allow to dump kmesg during regular reboot
  ramoops: add max_reason optional field to ramoops DT node

 Documentation/admin-guide/ramoops.rst         | 11 +++---
 .../bindings/reserved-memory/ramoops.txt      | 10 ++++--
 drivers/platform/chrome/chromeos_pstore.c     |  2 +-
 fs/pstore/platform.c                          |  4 ++-
 fs/pstore/ram.c                               | 35 +++++++++----------
 include/linux/kmsg_dump.h                     |  1 +
 include/linux/pstore.h                        |  3 ++
 include/linux/pstore_ram.h                    |  2 +-
 kernel/printk/printk.c                        | 15 +++++---
 9 files changed, 51 insertions(+), 32 deletions(-)

Comments

Kees Cook May 5, 2020, 9:59 p.m. UTC | #1
On Tue, May 05, 2020 at 11:45:07AM -0400, Pavel Tatashin wrote:
> Add a new field to pstore_info that passes information about kmesg dump
> maximum reason.
> 
> This allows a finer control of what kmesg dumps are stored on pstore
> device.
> 
> Those clients that do not explicitly set this field (keep it equal to 0),
> get the default behavior: dump only Oops and Panics, and dump everything
> if printk.always_kmsg_dump is provided.
> 
> Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
> ---
>  fs/pstore/platform.c   | 4 +++-
>  include/linux/pstore.h | 3 +++
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> index 408277ee3cdb..75bf8a43f92a 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -602,8 +602,10 @@ int pstore_register(struct pstore_info *psi)
>  	if (pstore_is_mounted())
>  		pstore_get_records(0);
>  
> -	if (psi->flags & PSTORE_FLAGS_DMESG)
> +	if (psi->flags & PSTORE_FLAGS_DMESG) {
> +		pstore_dumper.max_reason = psinfo->max_reason;
>  		pstore_register_kmsg();
> +	}

I haven't finished reading the whole series carefully, but I think
something we can do here to make things a bit more user-friendly is to
do the KMSG_DUMP_UNDEF value here to get us the "all" instead of INT_MAX:

	if (psi->flags & PSTORE_FLAGS_DMESG) {
		pstore_dumper.max_reason = psinfo->max_reason;
		if (pstore_dumper.max_reason == KMSG_DUMP_UNDEF)
			pstore_dumper.max_reason = KMSG_DUMP_MAX;
		pstore_register_kmsg();
	}

That way setting max_reason to 0 without setting dump_oops at all will
get "all". I think it'll need some tweaks to the next patch.

>  	if (psi->flags & PSTORE_FLAGS_CONSOLE)
>  		pstore_register_console();
>  	if (psi->flags & PSTORE_FLAGS_FTRACE)
> diff --git a/include/linux/pstore.h b/include/linux/pstore.h
> index e779441e6d26..45ae424bfeb5 100644
> --- a/include/linux/pstore.h
> +++ b/include/linux/pstore.h
> @@ -97,6 +97,8 @@ struct pstore_record {
>   * @read_mutex:	serializes @open, @read, @close, and @erase callbacks
>   * @flags:	bitfield of frontends the backend can accept writes for
>   * @data:	backend-private pointer passed back during callbacks
> + * @max_reason: Used when PSTORE_FLAGS_DMESG is set. Contains the
> + *              kmsg_dump_reason enum value.

Nit: please move this above @data since it has a @flags dependency.

>   *
>   * Callbacks:
>   *
> @@ -180,6 +182,7 @@ struct pstore_info {
>  
>  	int		flags;
>  	void		*data;
> +	int		max_reason;
>  
>  	int		(*open)(struct pstore_info *psi);
>  	int		(*close)(struct pstore_info *psi);
> -- 
> 2.25.1
> 

Looking good!
Kees Cook May 5, 2020, 11:15 p.m. UTC | #2
On Tue, May 05, 2020 at 11:45:08AM -0400, Pavel Tatashin wrote:
> Now, that pstore_register() can correctly pass max_reason to kmesg dump
> facility, use it instead of dump_oops boolean.
> 
> Replace in ramoops_platform_data dump_oops with max_reason. When dump_oops
> was enabled set max_reason to KMSG_DUMP_OOPS, otherwise set it to
> KMSG_DUMP_PANIC.
> 
> Remove filtering logic from ramoops_pstore_write(), as that is not needed
> anymore, only dmesges specified by max_reason are passed to
> ramoops_pstore_write(). Also, because of this, we can remove
> cxt->dump_oops.

This is all looking good. I think I'd like to see patch 3 and 4 merged,
though. I'd like to make the dump_oops/max_reason conversion in one
patch. Noted below...

> 
> Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
> ---
>  Documentation/admin-guide/ramoops.rst     | 11 ++++++----
>  drivers/platform/chrome/chromeos_pstore.c |  2 +-
>  fs/pstore/ram.c                           | 26 +++++++----------------
>  include/linux/pstore_ram.h                |  2 +-
>  4 files changed, 17 insertions(+), 24 deletions(-)
> 
> diff --git a/Documentation/admin-guide/ramoops.rst b/Documentation/admin-guide/ramoops.rst
> index 6dbcc5481000..a296e1aa1617 100644
> --- a/Documentation/admin-guide/ramoops.rst
> +++ b/Documentation/admin-guide/ramoops.rst
> @@ -32,11 +32,14 @@ memory to be mapped strongly ordered, and atomic operations on strongly ordered
>  memory are implementation defined, and won't work on many ARMs such as omaps.
>  
>  The memory area is divided into ``record_size`` chunks (also rounded down to
> -power of two) and each oops/panic writes a ``record_size`` chunk of
> +power of two) and each kmesg dump writes a ``record_size`` chunk of
>  information.
>  
> -Dumping both oopses and panics can be done by setting 1 in the ``dump_oops``
> -variable while setting 0 in that variable dumps only the panics.
> +Dumping reasons can be set via max_reason value, as defined in
> +include/linux/kmsg_dump.h: kmsg_dump_reason. For example, to
> +dump for both oopses and panics reasons, max_reason should be set to 2
> +(KMSG_DUMP_OOPS), to dump panics only max_reason should be set to 1
> +(KMSG_DUMP_PANIC).
>  
>  The module uses a counter to record multiple dumps but the counter gets reset
>  on restart (i.e. new dumps after the restart will overwrite old ones).
> @@ -90,7 +93,7 @@ Setting the ramoops parameters can be done in several different manners:
>          .mem_address            = <...>,
>          .mem_type               = <...>,
>          .record_size            = <...>,
> -        .dump_oops              = <...>,
> +        .max_reason             = <...>,
>          .ecc                    = <...>,
>    };

Good, yes, dump_oops should be removed from the platform data structure
since that's an entirely internal API.

>  
> diff --git a/drivers/platform/chrome/chromeos_pstore.c b/drivers/platform/chrome/chromeos_pstore.c
> index d13770785fb5..fa51153688b4 100644
> --- a/drivers/platform/chrome/chromeos_pstore.c
> +++ b/drivers/platform/chrome/chromeos_pstore.c
> @@ -57,7 +57,7 @@ static struct ramoops_platform_data chromeos_ramoops_data = {
>  	.record_size	= 0x40000,
>  	.console_size	= 0x20000,
>  	.ftrace_size	= 0x20000,
> -	.dump_oops	= 1,
> +	.max_reason	= KMSG_DUMP_OOPS,
>  };
>  
>  static struct platform_device chromeos_ramoops = {
> diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
> index 795622190c01..223581aeea96 100644
> --- a/fs/pstore/ram.c
> +++ b/fs/pstore/ram.c
> @@ -81,7 +81,6 @@ struct ramoops_context {
>  	size_t console_size;
>  	size_t ftrace_size;
>  	size_t pmsg_size;
> -	int dump_oops;
>  	u32 flags;
>  	struct persistent_ram_ecc_info ecc_info;
>  	unsigned int max_dump_cnt;
> @@ -381,18 +380,6 @@ static int notrace ramoops_pstore_write(struct pstore_record *record)
>  	if (record->type != PSTORE_TYPE_DMESG)
>  		return -EINVAL;
>  
> -	/*
> -	 * Out of the various dmesg dump types, ramoops is currently designed
> -	 * to only store crash logs, rather than storing general kernel logs.
> -	 */
> -	if (record->reason != KMSG_DUMP_OOPS &&
> -	    record->reason != KMSG_DUMP_PANIC)
> -		return -EINVAL;
> -
> -	/* Skip Oopes when configured to do so. */
> -	if (record->reason == KMSG_DUMP_OOPS && !cxt->dump_oops)
> -		return -EINVAL;
> -
>  	/*
>  	 * Explicitly only take the first part of any new crash.
>  	 * If our buffer is larger than kmsg_bytes, this can never happen,
> @@ -687,7 +674,7 @@ static int ramoops_parse_dt(struct platform_device *pdev,
>  	pdata->mem_size = resource_size(res);
>  	pdata->mem_address = res->start;
>  	pdata->mem_type = of_property_read_bool(of_node, "unbuffered");
> -	pdata->dump_oops = !of_property_read_bool(of_node, "no-dump-oops");
> +	dump_oops = !of_property_read_bool(of_node, "no-dump-oops");
:
Is this setting the module param variable? That shouldn't happen here --
we may fail the DT and overwrite the user-configured setting for a
different backend. This should be a local variable and the "final"
max_reason should be calculated in this function, I think.

>  
>  #define parse_size(name, field) {					\
>  		ret = ramoops_parse_dt_size(pdev, name, &value);	\
> @@ -785,7 +772,6 @@ static int ramoops_probe(struct platform_device *pdev)
>  	cxt->console_size = pdata->console_size;
>  	cxt->ftrace_size = pdata->ftrace_size;
>  	cxt->pmsg_size = pdata->pmsg_size;
> -	cxt->dump_oops = pdata->dump_oops;
>  	cxt->flags = pdata->flags;
>  	cxt->ecc_info = pdata->ecc_info;
>  
> @@ -828,8 +814,14 @@ static int ramoops_probe(struct platform_device *pdev)
>  	 * the single region size is how to check.
>  	 */
>  	cxt->pstore.flags = 0;
> -	if (cxt->max_dump_cnt)
> +	if (cxt->max_dump_cnt) {
>  		cxt->pstore.flags |= PSTORE_FLAGS_DMESG;
> +		if (pdata->max_reason <= 0) {
> +			pdata->max_reason = dump_oops ? KMSG_DUMP_OOPS :
> +							KMSG_DUMP_PANIC;
> +		}
> +		cxt->pstore.max_reason = pdata->max_reason;
> +	}

I'm going to take a stab at reorganizing the DT, platform data, and
module args to have default handling done in a way that I like. I'm
having a hard time making specific suggestions here. :)

>  	if (cxt->console_size)
>  		cxt->pstore.flags |= PSTORE_FLAGS_CONSOLE;
>  	if (cxt->max_ftrace_cnt)
> @@ -865,7 +857,6 @@ static int ramoops_probe(struct platform_device *pdev)
>  	mem_size = pdata->mem_size;
>  	mem_address = pdata->mem_address;
>  	record_size = pdata->record_size;
> -	dump_oops = pdata->dump_oops;
>  	ramoops_console_size = pdata->console_size;
>  	ramoops_pmsg_size = pdata->pmsg_size;
>  	ramoops_ftrace_size = pdata->ftrace_size;
> @@ -948,7 +939,6 @@ static void __init ramoops_register_dummy(void)
>  	pdata.console_size = ramoops_console_size;
>  	pdata.ftrace_size = ramoops_ftrace_size;
>  	pdata.pmsg_size = ramoops_pmsg_size;
> -	pdata.dump_oops = dump_oops;
>  	pdata.flags = RAMOOPS_FLAG_FTRACE_PER_CPU;
>  
>  	/*
> diff --git a/include/linux/pstore_ram.h b/include/linux/pstore_ram.h
> index 9cb9b9067298..9f16afec7290 100644
> --- a/include/linux/pstore_ram.h
> +++ b/include/linux/pstore_ram.h
> @@ -133,7 +133,7 @@ struct ramoops_platform_data {
>  	unsigned long	console_size;
>  	unsigned long	ftrace_size;
>  	unsigned long	pmsg_size;
> -	int		dump_oops;
> +	int		max_reason;
>  	u32		flags;
>  	struct persistent_ram_ecc_info ecc_info;
>  };
> -- 
> 2.25.1
> 

So, hold off on a v3, and I'll send a series tomorrow, based on what
you've got here for v2. I like the refactoring; it's much cleaner to
have max_reason than dump_oops! :)
Pasha Tatashin May 6, 2020, 1:42 p.m. UTC | #3
> > Remove filtering logic from ramoops_pstore_write(), as that is not needed
> > anymore, only dmesges specified by max_reason are passed to
> > ramoops_pstore_write(). Also, because of this, we can remove
> > cxt->dump_oops.
>
> This is all looking good. I think I'd like to see patch 3 and 4 merged,
> though. I'd like to make the dump_oops/max_reason conversion in one
> patch. Noted below...

Sure, I can do it.

>
> >
> > Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
> > ---
> >  Documentation/admin-guide/ramoops.rst     | 11 ++++++----
> >  drivers/platform/chrome/chromeos_pstore.c |  2 +-
> >  fs/pstore/ram.c                           | 26 +++++++----------------
> >  include/linux/pstore_ram.h                |  2 +-
> >  4 files changed, 17 insertions(+), 24 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/ramoops.rst b/Documentation/admin-guide/ramoops.rst
> > index 6dbcc5481000..a296e1aa1617 100644
> > --- a/Documentation/admin-guide/ramoops.rst
> > +++ b/Documentation/admin-guide/ramoops.rst
> > @@ -32,11 +32,14 @@ memory to be mapped strongly ordered, and atomic operations on strongly ordered
> >  memory are implementation defined, and won't work on many ARMs such as omaps.
> >
> >  The memory area is divided into ``record_size`` chunks (also rounded down to
> > -power of two) and each oops/panic writes a ``record_size`` chunk of
> > +power of two) and each kmesg dump writes a ``record_size`` chunk of
> >  information.
> >
> > -Dumping both oopses and panics can be done by setting 1 in the ``dump_oops``
> > -variable while setting 0 in that variable dumps only the panics.
> > +Dumping reasons can be set via max_reason value, as defined in
> > +include/linux/kmsg_dump.h: kmsg_dump_reason. For example, to
> > +dump for both oopses and panics reasons, max_reason should be set to 2
> > +(KMSG_DUMP_OOPS), to dump panics only max_reason should be set to 1
> > +(KMSG_DUMP_PANIC).
> >
> >  The module uses a counter to record multiple dumps but the counter gets reset
> >  on restart (i.e. new dumps after the restart will overwrite old ones).
> > @@ -90,7 +93,7 @@ Setting the ramoops parameters can be done in several different manners:
> >          .mem_address            = <...>,
> >          .mem_type               = <...>,
> >          .record_size            = <...>,
> > -        .dump_oops              = <...>,
> > +        .max_reason             = <...>,
> >          .ecc                    = <...>,
> >    };
>
> Good, yes, dump_oops should be removed from the platform data structure
> since that's an entirely internal API.
>
> >
> > diff --git a/drivers/platform/chrome/chromeos_pstore.c b/drivers/platform/chrome/chromeos_pstore.c
> > index d13770785fb5..fa51153688b4 100644
> > --- a/drivers/platform/chrome/chromeos_pstore.c
> > +++ b/drivers/platform/chrome/chromeos_pstore.c
> > @@ -57,7 +57,7 @@ static struct ramoops_platform_data chromeos_ramoops_data = {
> >       .record_size    = 0x40000,
> >       .console_size   = 0x20000,
> >       .ftrace_size    = 0x20000,
> > -     .dump_oops      = 1,
> > +     .max_reason     = KMSG_DUMP_OOPS,
> >  };
> >
> >  static struct platform_device chromeos_ramoops = {
> > diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
> > index 795622190c01..223581aeea96 100644
> > --- a/fs/pstore/ram.c
> > +++ b/fs/pstore/ram.c
> > @@ -81,7 +81,6 @@ struct ramoops_context {
> >       size_t console_size;
> >       size_t ftrace_size;
> >       size_t pmsg_size;
> > -     int dump_oops;
> >       u32 flags;
> >       struct persistent_ram_ecc_info ecc_info;
> >       unsigned int max_dump_cnt;
> > @@ -381,18 +380,6 @@ static int notrace ramoops_pstore_write(struct pstore_record *record)
> >       if (record->type != PSTORE_TYPE_DMESG)
> >               return -EINVAL;
> >
> > -     /*
> > -      * Out of the various dmesg dump types, ramoops is currently designed
> > -      * to only store crash logs, rather than storing general kernel logs.
> > -      */
> > -     if (record->reason != KMSG_DUMP_OOPS &&
> > -         record->reason != KMSG_DUMP_PANIC)
> > -             return -EINVAL;
> > -
> > -     /* Skip Oopes when configured to do so. */
> > -     if (record->reason == KMSG_DUMP_OOPS && !cxt->dump_oops)
> > -             return -EINVAL;
> > -
> >       /*
> >        * Explicitly only take the first part of any new crash.
> >        * If our buffer is larger than kmsg_bytes, this can never happen,
> > @@ -687,7 +674,7 @@ static int ramoops_parse_dt(struct platform_device *pdev,
> >       pdata->mem_size = resource_size(res);
> >       pdata->mem_address = res->start;
> >       pdata->mem_type = of_property_read_bool(of_node, "unbuffered");
> > -     pdata->dump_oops = !of_property_read_bool(of_node, "no-dump-oops");
> > +     dump_oops = !of_property_read_bool(of_node, "no-dump-oops");
> :
> Is this setting the module param variable? That shouldn't happen here --
> we may fail the DT and overwrite the user-configured setting for a
> different backend. This should be a local variable and the "final"
> max_reason should be calculated in this function, I think.

Hm, interesting, not sure if this is a realistic scenario. If I
understand the code correctly, dummy is the only device that can
pick-up dump_oops parameter, and it is registered before the DT based
backend:

ramoops_init(void)
   ramoops_register_dummy();  -> register dummy if mem_size is provided
   platform_driver_register(&ramoops_driver); -> register DT based node.

dummy is registered only if mem_size parameter is provided. Deprecated
dump_oops if provided by the user is converted to max_reason and
discarded after that. The value of dump_oops becomes meaningless in
/sys/module/ramoops/parameters/, as it does not really carry any
information about kmsg dumps anymore. max_reason is what carries that
information.

After the dummy backend is registered, even if DT changes dump_oops,
and still fails to register, it does not matter, as the dummy will
keep operating correctly with the set max_reason.

>
> >
> >  #define parse_size(name, field) {                                    \
> >               ret = ramoops_parse_dt_size(pdev, name, &value);        \
> > @@ -785,7 +772,6 @@ static int ramoops_probe(struct platform_device *pdev)
> >       cxt->console_size = pdata->console_size;
> >       cxt->ftrace_size = pdata->ftrace_size;
> >       cxt->pmsg_size = pdata->pmsg_size;
> > -     cxt->dump_oops = pdata->dump_oops;
> >       cxt->flags = pdata->flags;
> >       cxt->ecc_info = pdata->ecc_info;
> >
> > @@ -828,8 +814,14 @@ static int ramoops_probe(struct platform_device *pdev)
> >        * the single region size is how to check.
> >        */
> >       cxt->pstore.flags = 0;
> > -     if (cxt->max_dump_cnt)
> > +     if (cxt->max_dump_cnt) {
> >               cxt->pstore.flags |= PSTORE_FLAGS_DMESG;
> > +             if (pdata->max_reason <= 0) {
> > +                     pdata->max_reason = dump_oops ? KMSG_DUMP_OOPS :
> > +                                                     KMSG_DUMP_PANIC;
> > +             }
> > +             cxt->pstore.max_reason = pdata->max_reason;
> > +     }
>
> I'm going to take a stab at reorganizing the DT, platform data, and
> module args to have default handling done in a way that I like. I'm
> having a hard time making specific suggestions here. :)

Sure, unfortunatly I do not think we can simply remove "no-dump-oops"
from DT, and I also looked through the kernel and did not find any DTs
that use "no-dump-oops" in the kernel.

>
> >       if (cxt->console_size)
> >               cxt->pstore.flags |= PSTORE_FLAGS_CONSOLE;
> >       if (cxt->max_ftrace_cnt)
> > @@ -865,7 +857,6 @@ static int ramoops_probe(struct platform_device *pdev)
> >       mem_size = pdata->mem_size;
> >       mem_address = pdata->mem_address;
> >       record_size = pdata->record_size;
> > -     dump_oops = pdata->dump_oops;
> >       ramoops_console_size = pdata->console_size;
> >       ramoops_pmsg_size = pdata->pmsg_size;
> >       ramoops_ftrace_size = pdata->ftrace_size;
> > @@ -948,7 +939,6 @@ static void __init ramoops_register_dummy(void)
> >       pdata.console_size = ramoops_console_size;
> >       pdata.ftrace_size = ramoops_ftrace_size;
> >       pdata.pmsg_size = ramoops_pmsg_size;
> > -     pdata.dump_oops = dump_oops;
> >       pdata.flags = RAMOOPS_FLAG_FTRACE_PER_CPU;
> >
> >       /*
> > diff --git a/include/linux/pstore_ram.h b/include/linux/pstore_ram.h
> > index 9cb9b9067298..9f16afec7290 100644
> > --- a/include/linux/pstore_ram.h
> > +++ b/include/linux/pstore_ram.h
> > @@ -133,7 +133,7 @@ struct ramoops_platform_data {
> >       unsigned long   console_size;
> >       unsigned long   ftrace_size;
> >       unsigned long   pmsg_size;
> > -     int             dump_oops;
> > +     int             max_reason;
> >       u32             flags;
> >       struct persistent_ram_ecc_info ecc_info;
> >  };
> > --
> > 2.25.1
> >
>
> So, hold off on a v3, and I'll send a series tomorrow, based on what
> you've got here for v2. I like the refactoring; it's much cleaner to
> have max_reason than dump_oops! :)

Sure, I will wait for your changes.

Thank you,
Pasha

>
> --
> Kees Cook
Steven Rostedt May 6, 2020, 1:52 p.m. UTC | #4
On Tue, 5 May 2020 14:59:37 -0700
Kees Cook <keescook@chromium.org> wrote:

> > @@ -97,6 +97,8 @@ struct pstore_record {
> >   * @read_mutex:	serializes @open, @read, @close, and @erase callbacks
> >   * @flags:	bitfield of frontends the backend can accept writes for
> >   * @data:	backend-private pointer passed back during callbacks
> > + * @max_reason: Used when PSTORE_FLAGS_DMESG is set. Contains the
> > + *              kmsg_dump_reason enum value.  
> 
> Nit: please move this above @data since it has a @flags dependency.
> 
> >   *
> >   * Callbacks:
> >   *
> > @@ -180,6 +182,7 @@ struct pstore_info {
> >  
> >  	int		flags;
> >  	void		*data;
> > +	int		max_reason;

Not to mention that moving max_reason above data will fill in the hole left
by a 32 bit int, followed by a 64 bit pointer.

-- Steve


> >  
> >  	int		(*open)(struct pstore_info *psi);
> >  	int		(*close)(struct pstore_info *psi);
Pasha Tatashin May 6, 2020, 2:31 p.m. UTC | #5
On Wed, May 6, 2020 at 9:52 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Tue, 5 May 2020 14:59:37 -0700
> Kees Cook <keescook@chromium.org> wrote:
>
> > > @@ -97,6 +97,8 @@ struct pstore_record {
> > >   * @read_mutex:    serializes @open, @read, @close, and @erase callbacks
> > >   * @flags: bitfield of frontends the backend can accept writes for
> > >   * @data:  backend-private pointer passed back during callbacks
> > > + * @max_reason: Used when PSTORE_FLAGS_DMESG is set. Contains the
> > > + *              kmsg_dump_reason enum value.
> >
> > Nit: please move this above @data since it has a @flags dependency.
> >
> > >   *
> > >   * Callbacks:
> > >   *
> > > @@ -180,6 +182,7 @@ struct pstore_info {
> > >
> > >     int             flags;
> > >     void            *data;
> > > +   int             max_reason;
>
> Not to mention that moving max_reason above data will fill in the hole left
> by a 32 bit int, followed by a 64 bit pointer.

Good point. I will do it in the next version.

Thank you,
Pasha

>
> -- Steve
>
>
> > >
> > >     int             (*open)(struct pstore_info *psi);
> > >     int             (*close)(struct pstore_info *psi);
Pasha Tatashin May 6, 2020, 2:42 p.m. UTC | #6
On Tue, May 5, 2020 at 5:59 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Tue, May 05, 2020 at 11:45:07AM -0400, Pavel Tatashin wrote:
> > Add a new field to pstore_info that passes information about kmesg dump
> > maximum reason.
> >
> > This allows a finer control of what kmesg dumps are stored on pstore
> > device.
> >
> > Those clients that do not explicitly set this field (keep it equal to 0),
> > get the default behavior: dump only Oops and Panics, and dump everything
> > if printk.always_kmsg_dump is provided.
> >
> > Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
> > ---
> >  fs/pstore/platform.c   | 4 +++-
> >  include/linux/pstore.h | 3 +++
> >  2 files changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> > index 408277ee3cdb..75bf8a43f92a 100644
> > --- a/fs/pstore/platform.c
> > +++ b/fs/pstore/platform.c
> > @@ -602,8 +602,10 @@ int pstore_register(struct pstore_info *psi)
> >       if (pstore_is_mounted())
> >               pstore_get_records(0);
> >
> > -     if (psi->flags & PSTORE_FLAGS_DMESG)
> > +     if (psi->flags & PSTORE_FLAGS_DMESG) {
> > +             pstore_dumper.max_reason = psinfo->max_reason;
> >               pstore_register_kmsg();
> > +     }
>
> I haven't finished reading the whole series carefully, but I think
> something we can do here to make things a bit more user-friendly is to
> do the KMSG_DUMP_UNDEF value here to get us the "all" instead of INT_MAX:
>
>         if (psi->flags & PSTORE_FLAGS_DMESG) {
>                 pstore_dumper.max_reason = psinfo->max_reason;
>                 if (pstore_dumper.max_reason == KMSG_DUMP_UNDEF)
>                         pstore_dumper.max_reason = KMSG_DUMP_MAX;
>                 pstore_register_kmsg();
>         }
>
> That way setting max_reason to 0 without setting dump_oops at all will
> get "all". I think it'll need some tweaks to the next patch.

Hm, but if we change it this way, it will change the behavior for
other backends. With the current version of this patchset,
when psinfo->max_reason is left undefined (0, KMSG_DUMP_UNDEF) -> the
existing behaviour is honored, which means: printk chooses the kmesg
dump level, and users can set dump for all reasons via printk
always_kmsg_dump. This is what efi-pstore, erst, and nvram_64 are
currently doing, and I am not sure we should change that.
However, with the proposed change if the backend specifically sets
max_reason: printk will use it and ignore always_kmsg_dump.

Thank you,
Pasha
Petr Mladek May 14, 2020, 8:49 a.m. UTC | #7
On Tue 2020-05-05 11:45:06, Pavel Tatashin wrote:
> kmsg_dump() allows to dump kmesg buffer for various system events: oops,
> panic, reboot, etc. It provides an interface to register a callback call
> for clients, and in that callback interface there is a field "max_reason"
> which gets ignored unless always_kmsg_dump is passed as kernel parameter.
> 
> Allow clients to decide max_reason, and keep the current behavior when
> max_reason is not set.
> 
> Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
> ---
>  include/linux/kmsg_dump.h |  1 +
>  kernel/printk/printk.c    | 15 +++++++++++----
>  2 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/kmsg_dump.h b/include/linux/kmsg_dump.h
> index 2e7a1e032c71..cfc042066be7 100644
> --- a/include/linux/kmsg_dump.h
> +++ b/include/linux/kmsg_dump.h
> @@ -28,6 +28,7 @@ enum kmsg_dump_reason {
>  	KMSG_DUMP_RESTART,
>  	KMSG_DUMP_HALT,
>  	KMSG_DUMP_POWEROFF,
> +	KMSG_DUMP_MAX
>  };
>  
>  /**
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 9a9b6156270b..1aab69a8a2bf 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3157,12 +3157,19 @@ void kmsg_dump(enum kmsg_dump_reason reason)
>  	struct kmsg_dumper *dumper;
>  	unsigned long flags;
>  
> -	if ((reason > KMSG_DUMP_OOPS) && !always_kmsg_dump)
> -		return;
> -
>  	rcu_read_lock();
>  	list_for_each_entry_rcu(dumper, &dump_list, list) {
> -		if (dumper->max_reason && reason > dumper->max_reason)
> +		enum kmsg_dump_reason cur_reason = dumper->max_reason;

If this code is still in the next version, please, rename this variable
to max_reason or so.

"cur" is ambiguous. It might be current dumper or current message
which confused me later in the code ;-)

Best Regards,
Petr

> +
> +		/*
> +		 * If client has not provided a specific max_reason, default
> +		 * to KMSG_DUMP_OOPS, unless always_kmsg_dump was set.
> +		 */
> +		if (cur_reason == KMSG_DUMP_UNDEF) {
> +			cur_reason = always_kmsg_dump ? KMSG_DUMP_MAX :
> +							KMSG_DUMP_OOPS;
> +		}
> +		if (reason > cur_reason)
>  			continue;
>  
>  		/* initialize iterator with data about the stored records */
> -- 
> 2.25.1