powerpc/crashkernel: take mem option into account

Message ID 1568256617-14030-1-git-send-email-kernelfans@gmail.com
State New
Series
  • powerpc/crashkernel: take mem option into account

Checks

Context Check Description
snowpatch_ozlabs/checkpatch warning total: 0 errors, 0 warnings, 1 checks, 22 lines checked
snowpatch_ozlabs/build-pmac32 success Build succeeded
snowpatch_ozlabs/build-ppc64e success Build succeeded
snowpatch_ozlabs/build-ppc64be success Build succeeded
snowpatch_ozlabs/build-ppc64le success Build succeeded
snowpatch_ozlabs/apply_patch success Successfully applied on branch next (c317052c95bef1f977b023158e5aa929215f443d)

Commit Message

Pingfan Liu Sept. 12, 2019, 2:50 a.m. UTC
The 'mem=' option is an easy way to put high pressure on memory during
testing. Hence, instead of the total memory, the effective usable memory
size should be considered when reserving memory for the crashkernel.
Otherwise boot may hit an OOM issue.

E.g., passing
crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and
mem=5G on a 256G machine.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
To: linuxppc-dev@lists.ozlabs.org
---
v1 -> v2: fix the printk info about the total mem
 arch/powerpc/kernel/machine_kexec.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

Pingfan Liu Sept. 17, 2019, 5:29 a.m. UTC | #1
Cc Kexec list. And keep the original content.

On Thu, Sep 12, 2019 at 10:50 AM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> 'mem=" option is an easy way to put high pressure on memory during some
> test. Hence in stead of total mem, the effective usable memory size should
> be considered when reserving mem for crashkernel. Otherwise the boot up may
> experience oom issue.
>
> E.g passing
> crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and
> mem=5G on a 256G machine.
>
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> To: linuxppc-dev@lists.ozlabs.org
> ---
> v1 -> v2: fix the printk info about the total mem
>  arch/powerpc/kernel/machine_kexec.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
> index c4ed328..eec96dc 100644
> --- a/arch/powerpc/kernel/machine_kexec.c
> +++ b/arch/powerpc/kernel/machine_kexec.c
> @@ -114,11 +114,12 @@ void machine_kexec(struct kimage *image)
>
>  void __init reserve_crashkernel(void)
>  {
> -       unsigned long long crash_size, crash_base;
> +       unsigned long long crash_size, crash_base, total_mem_sz;
>         int ret;
>
> +       total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
>         /* use common parsing */
> -       ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +       ret = parse_crashkernel(boot_command_line, total_mem_sz,
>                         &crash_size, &crash_base);
>         if (ret == 0 && crash_size > 0) {
>                 crashk_res.start = crash_base;
> @@ -185,7 +186,7 @@ void __init reserve_crashkernel(void)
>                         "for crashkernel (System RAM: %ldMB)\n",
>                         (unsigned long)(crash_size >> 20),
>                         (unsigned long)(crashk_res.start >> 20),
> -                       (unsigned long)(memblock_phys_mem_size() >> 20));
> +                       (unsigned long)(total_mem_sz >> 20));
>
>         if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
>             memblock_reserve(crashk_res.start, crash_size)) {
> --
> 2.7.5
>
Michael Ellerman Sept. 18, 2019, 11:22 a.m. UTC | #2
Pingfan Liu <kernelfans@gmail.com> writes:
> Cc Kexec list. And keep the original content.
>
> On Thu, Sep 12, 2019 at 10:50 AM Pingfan Liu <kernelfans@gmail.com> wrote:
>>
>> 'mem=" option is an easy way to put high pressure on memory during some
>> test. Hence in stead of total mem, the effective usable memory size
               ^                          ^
               instead                    "actual" would be clearer

I think adding: "after applying the memory limit" 

would help here.

>> should be considered when reserving mem for crashkernel. Otherwise
>> the boot up may experience oom issue.
                              ^
                              OOM
>>
>> E.g passing
>> crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and
>> mem=5G on a 256G machine.

Spelling out the behaviour before and after would help here, eg:

.. "would reserve 4G prior to the change and 512M afterward."


>> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> To: linuxppc-dev@lists.ozlabs.org
>> ---
>> v1 -> v2: fix the printk info about the total mem
>>  arch/powerpc/kernel/machine_kexec.c | 7 ++++---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
>> index c4ed328..eec96dc 100644
>> --- a/arch/powerpc/kernel/machine_kexec.c
>> +++ b/arch/powerpc/kernel/machine_kexec.c
>> @@ -114,11 +114,12 @@ void machine_kexec(struct kimage *image)
>>
>>  void __init reserve_crashkernel(void)
>>  {
>> -       unsigned long long crash_size, crash_base;
>> +       unsigned long long crash_size, crash_base, total_mem_sz;
>>         int ret;
>>
>> +       total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
>>         /* use common parsing */
>> -       ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>> +       ret = parse_crashkernel(boot_command_line, total_mem_sz,
>>                         &crash_size, &crash_base);

I think this change makes sense. But we have multiple arches that
implement similar logic, and I wonder if we should keep them all the
same.

eg:

  arch/arm/kernel/setup.c:                ret = parse_crashkernel(boot_command_line, total_mem,
  arch/arm64/mm/init.c:                   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
  arch/ia64/kernel/setup.c:               ret = parse_crashkernel(boot_command_line, total,
  arch/mips/kernel/setup.c:               ret = parse_crashkernel(boot_command_line, total_mem,
  arch/powerpc/kernel/fadump.c:           ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
  arch/powerpc/kernel/machine_kexec.c:    ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
  arch/s390/kernel/setup.c:               rc = parse_crashkernel(boot_command_line, memory_end, &crash_size,
  arch/sh/kernel/machine_kexec.c:         ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
  arch/x86/kernel/setup.c:                ret = parse_crashkernel(boot_command_line, total_mem, &crash_size, &crash_base);


From a quick glance most of them don't seem to take the memory limit
into account.

So I guess the question is do we want all arches to implement the same
behaviour or do we think it doesn't matter if they differ in details
like this?

cheers

Patch

diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
index c4ed328..eec96dc 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -114,11 +114,12 @@  void machine_kexec(struct kimage *image)
 
 void __init reserve_crashkernel(void)
 {
-	unsigned long long crash_size, crash_base;
+	unsigned long long crash_size, crash_base, total_mem_sz;
 	int ret;
 
+	total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
 	/* use common parsing */
-	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+	ret = parse_crashkernel(boot_command_line, total_mem_sz,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size > 0) {
 		crashk_res.start = crash_base;
@@ -185,7 +186,7 @@  void __init reserve_crashkernel(void)
 			"for crashkernel (System RAM: %ldMB)\n",
 			(unsigned long)(crash_size >> 20),
 			(unsigned long)(crashk_res.start >> 20),
-			(unsigned long)(memblock_phys_mem_size() >> 20));
+			(unsigned long)(total_mem_sz >> 20));
 
 	if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
 	    memblock_reserve(crashk_res.start, crash_size)) {
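
The effect on the example from the commit message (the crashkernel= range
list combined with mem=5G on a 256G machine) can be checked with a small
user-space sketch. It is not the kernel's parser: parse_size(),
pick_crash_size() and effective_mem() are hypothetical helpers written for
illustration, and the sketch assumes a range "start-[end]:size" is selected
when start <= system RAM < end, with an omitted end meaning "no upper bound".
Validation done by the real parser is left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse "<number>[K|M|G]" and report where parsing stopped. */
static unsigned long long parse_size(const char *s, const char **end)
{
	char *p;
	unsigned long long v = strtoull(s, &p, 10);

	switch (*p) {
	case 'G':
		v <<= 10;
		/* fall through */
	case 'M':
		v <<= 10;
		/* fall through */
	case 'K':
		v <<= 10;
		p++;
		break;
	default:
		break;
	}
	*end = p;
	return v;
}

/*
 * Hypothetical helper: return the size of the first "start-[end]:size"
 * range that contains system_ram, or 0 if none matches or the string is
 * malformed. An omitted end is treated as "no upper bound" (assumption).
 */
static unsigned long long pick_crash_size(const char *spec,
					  unsigned long long system_ram)
{
	const char *cur = spec;

	assert(spec != NULL);
	while (*cur) {
		unsigned long long start, end = ULLONG_MAX, size;

		start = parse_size(cur, &cur);
		if (*cur != '-')
			return 0;
		cur++;
		if (*cur != ':')
			end = parse_size(cur, &cur);
		if (*cur != ':')
			return 0;
		cur++;
		size = parse_size(cur, &cur);

		if (system_ram >= start && system_ram < end)
			return size;
		if (*cur != ',')
			break;
		cur++;
	}
	return 0;
}

/* Mirrors the selection added by the patch: prefer the mem= limit if set. */
static unsigned long long effective_mem(unsigned long long memory_limit,
					unsigned long long phys_mem)
{
	return memory_limit ? memory_limit : phys_mem;
}
```

With the range list from the commit message, pick_crash_size() returns 4G
when passed the full 256G and 512M when passed effective_mem(5G, 256G),
which matches the before/after behaviour described in the review.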