Patchwork Reserve memory for kdump kernel within RMO region

Submitter Mohan Kumar M
Date Nov. 25, 2009, 1:17 p.m.
Message ID <20091125131747.GA28857@in.ibm.com>
Permalink /patch/39314/
State Not Applicable

Comments

Mohan Kumar M - Nov. 25, 2009, 1:17 p.m.
Reserve memory for kdump kernel within RMO region

When the kernel size exceeds 32MB (observed with some distros), memory
for the kdump kernel cannot be reserved, because the kdump kernel base
is always assumed to be 32MB. When the kernel has the CONFIG_RELOCATABLE
option enabled, allow the memory for the kdump kernel to be reserved
anywhere in the RMO region.

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
---
 arch/powerpc/kernel/machine_kexec.c |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)
Bernhard Walle - Nov. 25, 2009, 6:52 p.m.
M. Mohan Kumar schrieb:
> Reserve memory for kdump kernel within RMO region
> 
> When the kernel size exceeds 32MB(observed with some distros), memory
> for kdump kernel can not be reserved as kdump kernel base is assumed to
> be 32MB always. When the kernel has CONFIG_RELOCATABLE option enabled,
> provide the feature to reserve the memory for kdump kernel anywhere in
> the RMO region.

Correct me if I'm wrong, but: CONFIG_RELOCATABLE is for the kernel that
gets loaded as crashkernel, not for the kernel that loads the
crashkernel. So it would be perfectly fine for a kernel that does not
have CONFIG_RELOCATABLE set to load another kernel that has
CONFIG_RELOCATABLE set at an address != 32 M.

So it would be part of the command line to determine whether a fixed or
a variable address is used. The system configuration (or the admin)
knows both: if the kernel that should be loaded is relocatable (can be
detected with the x86 bzImage header or with the ELF type for vmlinux)
and it can also influence the boot command line.
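
A minimal sketch of the ELF-type check mentioned above, assuming that a
relocatable vmlinux is linked as ET_DYN while a fixed-address one is
ET_EXEC. The helper name elf_image_is_dyn is hypothetical and is not
taken from kexec-tools or the kernel; the header offsets and values are
the standard ELF ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets and values defined by the ELF specification. */
enum {
	EI_DATA_OFF = 5,	/* e_ident[EI_DATA]: byte order of the file */
	E_TYPE_OFF  = 16,	/* e_type follows the 16-byte e_ident array */
	DATA_LSB    = 1,	/* ELFDATA2LSB, little-endian */
	DATA_MSB    = 2,	/* ELFDATA2MSB, big-endian (e.g. ppc64) */
	TYPE_DYN    = 3		/* ET_DYN */
};

/*
 * Hypothetical helper: inspect the first bytes of an image.
 * Returns 1 if e_type is ET_DYN, 0 for any other ELF type,
 * and -1 if the buffer does not hold a parsable ELF header.
 */
static int elf_image_is_dyn(const unsigned char *buf, size_t len)
{
	uint16_t type;

	assert(buf != NULL);
	if (len < (size_t)E_TYPE_OFF + 2)
		return -1;
	if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
		return -1;

	/* e_type is a 16-bit field stored in the file's own byte order. */
	if (buf[EI_DATA_OFF] == DATA_MSB)
		type = (uint16_t)((buf[E_TYPE_OFF] << 8) | buf[E_TYPE_OFF + 1]);
	else if (buf[EI_DATA_OFF] == DATA_LSB)
		type = (uint16_t)(buf[E_TYPE_OFF] | (buf[E_TYPE_OFF + 1] << 8));
	else
		return -1;

	return type == TYPE_DYN;
}
```

Userland could read the first 18 bytes of the image into a buffer and
call elf_image_is_dyn() on it before deciding whether to pass a fixed
offset on the command line.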

To sum it up: I'm not against reserving it anywhere, I'm only against
making it dependent on CONFIG_RELOCATABLE which has another function.



Regards,
Bernhard
Mohan Kumar M - Nov. 26, 2009, 11:12 a.m.
On 11/26/2009 12:22 AM, Bernhard Walle wrote:
> M. Mohan Kumar schrieb:
>> Reserve memory for kdump kernel within RMO region
>>
>> When the kernel size exceeds 32MB(observed with some distros), memory
>> for kdump kernel can not be reserved as kdump kernel base is assumed to
>> be 32MB always. When the kernel has CONFIG_RELOCATABLE option enabled,
>> provide the feature to reserve the memory for kdump kernel anywhere in
>> the RMO region.
>

Hi Bernhard,

> Correct me if I'm wrong, but: CONFIG_RELOCATABLE is for the kernel that
> gets loaded as crashkernel, not for the kernel that loads the
> crashkernel. So it would be perfectly fine that a kernel that has not
> CONFIG_RELOCATABLE set would load another kernel that has
> CONFIG_RELOCATABLE set on an address != 32 M.

No, with the relocatable option, the same kernel is used as both the
production and the kdump kernel. If the kernel is not relocatable, the
kdump kernel can be loaded *only at* 32MB. So if a kernel has the
RELOCATABLE option enabled and the production kernel happens to be
larger than 32MB, the current code will not load the kdump kernel at
32MB, because the current kernel overlaps the kdump kernel region. So if
the kernel has the RELOCATABLE option, we could reserve memory for the
kdump kernel elsewhere within the RMO region.

>
> So it would be part of the command line to determine whether a fixed or
> a variable address is used. The system configuration (or the admin)
> knows both: if the kernel that should be loaded is relocatable (can be
> detected with the x86 bzImage header or with the ELF type for vmlinux)
> and it can also influence the boot command line.
>
> To sum it up: I'm not against reserving it anywhere, I'm only against
> making it dependent on CONFIG_RELOCATABLE which has another function.
>
>
>
> Regards,
> Bernhard
>
>
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
Bernhard Walle - Nov. 26, 2009, 7:26 p.m.
M. Mohan Kumar schrieb:
> On 11/26/2009 12:22 AM, Bernhard Walle wrote:
>> M. Mohan Kumar schrieb:
>>> Reserve memory for kdump kernel within RMO region
>>>
>>> When the kernel size exceeds 32MB(observed with some distros), memory
>>> for kdump kernel can not be reserved as kdump kernel base is assumed to
>>> be 32MB always. When the kernel has CONFIG_RELOCATABLE option enabled,
>>> provide the feature to reserve the memory for kdump kernel anywhere in
>>> the RMO region.
> 
> Hi Bernhard,
> 
>> Correct me if I'm wrong, but: CONFIG_RELOCATABLE is for the kernel that
>> gets loaded as crashkernel, not for the kernel that loads the
>> crashkernel. So it would be perfectly fine that a kernel that has not
>> CONFIG_RELOCATABLE set would load another kernel that has
>> CONFIG_RELOCATABLE set on an address != 32 M.
> 
> No, with relocatable option, the same kernel is used as both production 
> and kdump kernel.

Can be, but it's not strictly necessary. It depends on what userland
does. In particular, it's possible that a non-relocatable,
self-compiled kernel loads a relocatable distribution kernel as capture
kernel.

Also, it would make sense to make the behaviour symmetric across
platforms. Currently we have:

 - x86 and ia64: Without offset on command line, use any offset
                 With offset on command line, use that offset and fail
                 if no memory is available at that offset.
 - ppc64: Always use 32M and ignore the offset.

If your patch gets applied, we have:

 - ppc64: With CONFIG_RELOCATABLE, use any offset
          With offset on command

I don't see why the behaviour on ppc64 should be completely different.

Having maintained kdump for SUSE for x86, ia64 and partly ppc64 in the
past, I always felt that ppc64 is more different from x86 than ia64 is
from x86. That's one more step in that direction without a technical
reason.

Having said all that: If your patch gets into the mainline kernel, then
we should also change the behaviour for x86 and ia64.



Regards,
Bernhard
Mohan Kumar M - Nov. 27, 2009, 8:35 a.m.
On 11/27/2009 12:56 AM, Bernhard Walle wrote:
> M. Mohan Kumar schrieb:
>> On 11/26/2009 12:22 AM, Bernhard Walle wrote:
>>> M. Mohan Kumar schrieb:
>>>> Reserve memory for kdump kernel within RMO region
>>>>
>>>> When the kernel size exceeds 32MB(observed with some distros), memory
>>>> for kdump kernel can not be reserved as kdump kernel base is assumed to
>>>> be 32MB always. When the kernel has CONFIG_RELOCATABLE option enabled,
>>>> provide the feature to reserve the memory for kdump kernel anywhere in
>>>> the RMO region.
>>
>> Hi Bernhard,
>>
>>> Correct me if I'm wrong, but: CONFIG_RELOCATABLE is for the kernel that
>>> gets loaded as crashkernel, not for the kernel that loads the
>>> crashkernel. So it would be perfectly fine that a kernel that has not
>>> CONFIG_RELOCATABLE set would load another kernel that has
>>> CONFIG_RELOCATABLE set on an address != 32 M.
>>
>> No, with relocatable option, the same kernel is used as both production
>> and kdump kernel.
>
> Can be, but it's not strictly necessary. It depends what userland does.
> Especially it's possible that a non-relocatable, self-compiled kernel
> loads a relocatable distribution kernel as capture kernel.
>

I don't understand why a non-relocatable kernel would use a relocatable
kernel as the capture kernel. The idea of the relocatable kernel is to
avoid using two different kernels to capture a kernel dump.

> Also, it would make sense to make the behaviour symmetric across
> platforms. Currently we have:
>
>   - x86 and ia64: Without offset on command line, use any offset
>                   With offset on command line, use that offset and fail
>                   if no memory is available at that offset.
>   - ppc64: Always use 32M and ignore the offset.
>
> If your patch gets applied, we have:
>
>   - ppc64: With CONFIG_RELOCATABLE, use any offset
>            With offset on command
>
> I don't see why the behaviour on ppc64 should be completely different.
>
> Having maintained kdump for SUSE for x86, ia64 and partly ppc64 in the
> past, I always felt that ppc64 is more different from x86 than ia64 is
> from x86. That's one more step into that direction without a technical
> reason.

Also, with the crashkernel=auto parameter (patches are not yet merged),
the crashkernel base (offset) would be 32MB by default. In this case, if
a kernel is booted with crashkernel=auto and the first kernel's size
exceeds 32MB, reserving memory for the kdump kernel will always fail.

>
> Having that all said: If your patch gets in mainline kernel, than we
> should change the behaviour also for x86 and ia64.
>
>
>
> Regards,
> Bernhard
Simon Horman - Nov. 27, 2009, 11:51 a.m.
On Fri, Nov 27, 2009 at 02:05:46PM +0530, M. Mohan Kumar wrote:
> On 11/27/2009 12:56 AM, Bernhard Walle wrote:
> >M. Mohan Kumar schrieb:
> >>On 11/26/2009 12:22 AM, Bernhard Walle wrote:
> >>>M. Mohan Kumar schrieb:
> >>>>Reserve memory for kdump kernel within RMO region
> >>>>
> >>>>When the kernel size exceeds 32MB(observed with some distros), memory
> >>>>for kdump kernel can not be reserved as kdump kernel base is assumed to
> >>>>be 32MB always. When the kernel has CONFIG_RELOCATABLE option enabled,
> >>>>provide the feature to reserve the memory for kdump kernel anywhere in
> >>>>the RMO region.
> >>
> >>Hi Bernhard,
> >>
> >>>Correct me if I'm wrong, but: CONFIG_RELOCATABLE is for the kernel that
> >>>gets loaded as crashkernel, not for the kernel that loads the
> >>>crashkernel. So it would be perfectly fine that a kernel that has not
> >>>CONFIG_RELOCATABLE set would load another kernel that has
> >>>CONFIG_RELOCATABLE set on an address != 32 M.
> >>
> >>No, with relocatable option, the same kernel is used as both production
> >>and kdump kernel.
> >
> >Can be, but it's not strictly necessary. It depends what userland does.
> >Especially it's possible that a non-relocatable, self-compiled kernel
> >loads a relocatable distribution kernel as capture kernel.
> >
> 
> I don't understand why a non-relocatable kernel will use relocatable
> kernel for capturing kdump kernel. The idea for relocatable kernel
> is to avoid using two different kernels to capture kernel dump.

True, but that doesn't necessarily mean that using a relocatable
kdump kernel is required. Well, hopefully not.

> >Also, it would make sense to make the behaviour symmetric across
> >platforms. Currently we have:
> >
> >  - x86 and ia64: Without offset on command line, use any offset
> >                  With offset on command line, use that offset and fail
> >                  if no memory is available at that offset.
> >  - ppc64: Always use 32M and ignore the offset.
> >
> >If your patch gets applied, we have:
> >
> >  - ppc64: With CONFIG_RELOCATABLE, use any offset
> >           With offset on command
> >
> >I don't see why the behaviour on ppc64 should be completely different.
> >
> >Having maintained kdump for SUSE for x86, ia64 and partly ppc64 in the
> >past, I always felt that ppc64 is more different from x86 than ia64 is
> >from x86. That's one more step into that direction without a technical
> >reason.
> 
> Also with the crashkernel=auto parameter (patches are not yet
> merged), the crashkernel base (offset) by default would be 32MB. In
> this case if a kernel passed with crashkernel=auto and if the first
> kernel size exceeds 32MB, memory for kdump kernel will always fail.
> 
> >
> >Having that all said: If your patch gets in mainline kernel, than we
> >should change the behaviour also for x86 and ia64.
> >
> >
> >
> >Regards,
> >Bernhard
Mohan Kumar M - Nov. 27, 2009, 12:54 p.m.
Hi,

As of now, the kdump kernel base is fixed at 32MB. The intention of
this patch is to change that behaviour (for relocatable kernels):

* The regular kernel's size may exceed 32MB; in this case we can't use
32MB as the kdump kernel base.

* crashkernel=auto also assumes that the kdump kernel base is 32MB, so
it may also fail to reserve memory for the kdump kernel.

Bernhard Walle - Nov. 27, 2009, 6:39 p.m.
M. Mohan Kumar schrieb:
> Hi,
> 
> As of now the kdump kernel base is fixed to be 32MB. The intention of 
> this patch is to modify that behaviour (for relocatable kernels)
> 
> * Regular kernel size may exceed 32MB, in this case we can't have kdump 
> kernelbase as 32MB.
> 
> * crashkernel=auto also assumes that kdump kernelbase as 32MB, and it 
> may also fail in reserving memory for kdump kernel.

I'm not opposed to removing the 32MB restriction, but I would like to
make the behaviour independent of CONFIG_RELOCATABLE, as it is on x86:

 - Use 32M if one specifies xxx@32M
 - Use preferably 32M, but fall back to another address if one
   specifies xxx or xxx@0M.

I don't see any problem with that. But it would make the behaviour the
same across platforms.
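
The two rules above can be sketched as a small user-space model. This is
only an illustration under stated assumptions: pick_crash_base, struct
range and the 16MB search step are hypothetical and not taken from any
kernel tree, and a requested base of 0 stands for "xxx" or "xxx@0M".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MB			(1024ULL * 1024ULL)
#define KDUMP_DEFAULT_BASE	(32 * MB)	/* the 32M base from the thread */
#define KDUMP_STEP		(16 * MB)	/* assumed search step */

/* Half-open physical range [start, start + size). */
struct range {
	uint64_t start;
	uint64_t size;
};

static bool ranges_overlap(struct range a, struct range b)
{
	return a.start < b.start + b.size && b.start < a.start + a.size;
}

/*
 * Hypothetical helper: choose a base for the crash kernel.
 * requested_base != 0: use exactly that base, or fail.
 * requested_base == 0: prefer 32M, then fall back to higher addresses.
 * Returns the chosen base, or 0 if nothing fits below 'limit'.
 */
static uint64_t pick_crash_base(uint64_t requested_base, uint64_t crash_size,
				struct range kernel, uint64_t limit)
{
	assert(crash_size > 0);

	if (requested_base != 0) {
		struct range want = { requested_base, crash_size };

		/* Explicit offset: fail rather than silently move. */
		if (requested_base + crash_size > limit ||
		    ranges_overlap(want, kernel))
			return 0;
		return requested_base;
	}

	for (uint64_t base = KDUMP_DEFAULT_BASE;
	     base + crash_size <= limit; base += KDUMP_STEP) {
		struct range cand = { base, crash_size };

		if (!ranges_overlap(cand, kernel))
			return base;
	}
	return 0;
}
```

With a 40MB kernel at address 0 and a 256MB limit, asking for a 64MB
region at 32M fails, while asking without an offset yields a base above
the kernel image. Nothing in the model depends on CONFIG_RELOCATABLE.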



Regards,
Bernhard

Patch

diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
index baf1af0..99b2f9f 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -140,10 +140,29 @@  void __init reserve_crashkernel(void)
 
 	/* The crash region must not overlap the current kernel */
 	if (overlaps_crashkernel(__pa(_stext), _end - _stext)) {
+#ifdef CONFIG_RELOCATABLE
+		do {
+			/* Align kdump kernel to 16MB (size of large page) */
+			crashk_res.start = ALIGN(crashk_res.start +
+						(16 * 1024 * 1024), 0x1000000);
+			if (crashk_res.start + (_end - _stext) > lmb.rmo_size) {
+				printk(KERN_WARNING
+					"Not enough memory for crash kernel\n");
+				crashk_res.start = crashk_res.end = 0;
+				return;
+			}
+		} while (overlaps_crashkernel(__pa(_stext), _end - _stext));
+
+		crashk_res.end = crashk_res.start + crash_size - 1;
+		printk(KERN_INFO
+			"crash kernel memory overlaps with kernel memory\n"
+			"Moving it to %lx\n", (unsigned long)crashk_res.start);
+#else
 		printk(KERN_WARNING
 			"Crash kernel can not overlap current kernel\n");
 		crashk_res.start = crashk_res.end = 0;
 		return;
+#endif
 	}
 
 	/* Crash kernel trumps memory limit */
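
The search loop in the hunk above can be modelled in user space as
follows. This is a sketch, not the kernel code: move_crash_base,
overlaps and ALIGN_UP are stand-ins for the kernel's own helpers, and
the model checks the full crash region size against the RMO limit, which
is an assumption rather than what the patch does.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_16M	(16ULL * 1024 * 1024)

/* Round x up to a multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a)	(((x) + ((a) - 1)) & ~((uint64_t)(a) - 1))

/* Non-zero if [crash_start, +crash_size) intersects [k_start, +k_size). */
static int overlaps(uint64_t crash_start, uint64_t crash_size,
		    uint64_t k_start, uint64_t k_size)
{
	return crash_start < k_start + k_size &&
	       k_start < crash_start + crash_size;
}

/*
 * Hypothetical model of the relocation loop: advance the crash region
 * in 16MB steps until it no longer overlaps the running kernel.
 * Returns the new base, or 0 if the region would leave the RMO.
 */
static uint64_t move_crash_base(uint64_t start, uint64_t crash_size,
				uint64_t k_start, uint64_t k_size,
				uint64_t rmo_size)
{
	assert(crash_size > 0);

	while (overlaps(start, crash_size, k_start, k_size)) {
		start = ALIGN_UP(start + SZ_16M, SZ_16M);
		if (start + crash_size > rmo_size)
			return 0;	/* not enough memory in the RMO */
	}
	return start;
}
```

For a 40MB kernel at address 0, a 64MB crash region that starts at 32MB
moves up by one 16MB step when the RMO is 128MB, and the reservation is
abandoned when the RMO is too small to hold the moved region.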