diff mbox series

[PATCHv3] powerpc/crashkernel: take "mem=" option into account

Message ID 1582121897-24336-1-git-send-email-kernelfans@gmail.com (mailing list archive)
State Superseded
Headers show
Series [PATCHv3] powerpc/crashkernel: take "mem=" option into account | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success Successfully applied on branch powerpc/merge (65b2623f395a4e25ab3ff4cff1c9c7623619a22d)
snowpatch_ozlabs/build-ppc64le success Build succeeded
snowpatch_ozlabs/build-ppc64be success Build succeeded
snowpatch_ozlabs/build-ppc64e success Build succeeded
snowpatch_ozlabs/build-pmac32 success Build succeeded
snowpatch_ozlabs/checkpatch warning total: 0 errors, 1 warnings, 1 checks, 22 lines checked
snowpatch_ozlabs/needsstable success Patch has no Fixes tags

Commit Message

Pingfan Liu Feb. 19, 2020, 2:18 p.m. UTC
'mem=" option is an easy way to put high pressure on memory during some
test. Hence after applying the memory limit, instead of total mem, the
actual usable memory should be considered when reserving mem for
crashkernel. Otherwise the boot up may experience OOM issue.

E.g. it would reserve 4G prior to the change and 512M afterward, if passing
crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and
mem=5G on a 256G machine.

This issue is powerpc specific because it puts higher priority on fadump
and kdump reservation than on "mem=". Referring the following code:
	if (fadump_reserve_mem() == 0)
		reserve_crashkernel();
	...
	/* Ensure that total memory size is page-aligned. */
	limit = ALIGN(memory_limit ?: memblock_phys_mem_size(), PAGE_SIZE);
	memblock_enforce_memory_limit(limit);

While on other arches, the effect of "mem=" takes a higher priority and pass
through memblock_phys_mem_size() before calling reserve_crashkernel().

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: kexec@lists.infradead.org
---
v2 -> v3: improve commit log
 arch/powerpc/kernel/machine_kexec.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

Hari Bathini March 26, 2020, 7:48 a.m. UTC | #1
Hello Pingfan,

Thanks for the patch..

On 19/02/20 7:48 PM, Pingfan Liu wrote:
> 'mem=" option is an easy way to put high pressure on memory during some
> test. Hence after applying the memory limit, instead of total mem, the
> actual usable memory should be considered when reserving mem for
> crashkernel. Otherwise the boot up may experience OOM issue.
> 
> E.g. it would reserve 4G prior to the change and 512M afterward, if passing
> crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and
> mem=5G on a 256G machine.
> 
> This issue is powerpc specific because it puts higher priority on fadump
> and kdump reservation than on "mem=". Referring the following code:
> 	if (fadump_reserve_mem() == 0)
> 		reserve_crashkernel();
> 	...
> 	/* Ensure that total memory size is page-aligned. */
> 	limit = ALIGN(memory_limit ?: memblock_phys_mem_size(), PAGE_SIZE);
> 	memblock_enforce_memory_limit(limit);
> 
> While on other arches, the effect of "mem=" takes a higher priority and pass
> through memblock_phys_mem_size() before calling reserve_crashkernel().
> 
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> To: linuxppc-dev@lists.ozlabs.org
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: kexec@lists.infradead.org
> ---
> v2 -> v3: improve commit log
>  arch/powerpc/kernel/machine_kexec.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
> index c4ed328..eec96dc 100644
> --- a/arch/powerpc/kernel/machine_kexec.c
> +++ b/arch/powerpc/kernel/machine_kexec.c
> @@ -114,11 +114,12 @@ void machine_kexec(struct kimage *image)
>  
>  void __init reserve_crashkernel(void)
>  {
> -	unsigned long long crash_size, crash_base;
> +	unsigned long long crash_size, crash_base, total_mem_sz;
>  	int ret;
>  
> +	total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
>  	/* use common parsing */
> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +	ret = parse_crashkernel(boot_command_line, total_mem_sz,
>  			&crash_size, &crash_base);
>  	if (ret == 0 && crash_size > 0) {
>  		crashk_res.start = crash_base;

memory_limit is adjusted after this with the below snippet:

        /* Crash kernel trumps memory limit */
        if (memory_limit && memory_limit <= crashk_res.end) {                           
                memory_limit = crashk_res.end + 1;
                printk("Adjusted memory limit for crashkernel, now 0x%llx\n",           
                       memory_limit);                                                   
        } 

So, either the above snippet must be dropped or the print below should use an updated total_mem_sz
based on adjusted memory_limit. I would prefer the latter..

> @@ -185,7 +186,7 @@ void __init reserve_crashkernel(void)
>  			"for crashkernel (System RAM: %ldMB)\n",
>  			(unsigned long)(crash_size >> 20),
>  			(unsigned long)(crashk_res.start >> 20),
> -			(unsigned long)(memblock_phys_mem_size() >> 20));
> +			(unsigned long)(total_mem_sz >> 20));
>  
>  	if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
>  	    memblock_reserve(crashk_res.start, crash_size)) {
>
diff mbox series

Patch

diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
index c4ed328..eec96dc 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -114,11 +114,12 @@  void machine_kexec(struct kimage *image)
 
 void __init reserve_crashkernel(void)
 {
-	unsigned long long crash_size, crash_base;
+	unsigned long long crash_size, crash_base, total_mem_sz;
 	int ret;
 
+	total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
 	/* use common parsing */
-	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+	ret = parse_crashkernel(boot_command_line, total_mem_sz,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size > 0) {
 		crashk_res.start = crash_base;
@@ -185,7 +186,7 @@  void __init reserve_crashkernel(void)
 			"for crashkernel (System RAM: %ldMB)\n",
 			(unsigned long)(crash_size >> 20),
 			(unsigned long)(crashk_res.start >> 20),
-			(unsigned long)(memblock_phys_mem_size() >> 20));
+			(unsigned long)(total_mem_sz >> 20));
 
 	if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
 	    memblock_reserve(crashk_res.start, crash_size)) {