| Submitter | Michael Ellerman |
|---|---|
| Date | Jan. 7, 2009, 6:01 a.m. |
| Message ID | <1231308086.27416.14.camel@localhost> |
| Download | mbox | patch |
| Permalink | /patch/17052/ |
| State | Not Applicable |
| Headers | show |
Comments
On Wed, Jan 07, 2009 at 05:01:26PM +1100, Michael Ellerman wrote: > Hi all, > > The patch to dynamically allocate memory regions for ppc64 kexec-tools, > ie. "ppc64: kexec memory ranges dynamic allocation" (d182ce5), has never > worked AFAICT. > > Chandru reported it as broken when it was posted: > http://lists.infradead.org/pipermail/kexec/2008-October/002751.html > > Still, it's in now, and I'm trying to work out what's going wrong. > > The symptom is as reported by Chandru, we end up not being able to > allocate any memory (in locate_hole()). This is caused by the list of > memory_ranges being empty. > > The memory_ranges are empty because they have been realloc'ed (by the > dynamic alloc code), and the generic code is still looking at the old > version. > > What I'm not clear on is why the ppc64 code is even calling > setup_memory_ranges() a second time (in elf_ppc64_load()). It's already > been called by get_memory_ranges() from my_load(). Or is there another > route into elf_ppc64_load() that I'm not seeing? > > And in fact if I just remove that call, then everything is peachy. > > The following patch makes it work for me at least. Hi Michael, I must confess that I don't have a complete understanding of this problem. Does Bernhard's recent patch (sorry that I applied it even though it came in after your patch) help this problem? commit 95c74405638c786bc76fbca5e4e8427dfe26e907 Author: Bernhard Walle <bwalle@suse.de> Date: Fri Jan 16 19:11:34 2009 +0100 Subject: Fix memory corruption when using realloc_memory_ranges() > >From 40958d8f957876ca612b666f40bebf28ea335314 Mon Sep 17 00:00:00 2001 > From: Michael Ellerman <michael@ellerman.id.au> > Date: Wed, 7 Jan 2009 16:57:05 +1100 > Subject: [PATCH] Don't call setup_memory_ranges() again > > There's no need to call setup_memory_ranges() again. Furthermore it can > lead to the memory_ranges being realloc'ed, which results in the generic > code (locate_hole() etc.) using the free'd memory_ranges. > > Signed-off-by: Michael Ellerman <michael@ellerman.id.au> > --- > kexec/arch/ppc64/kexec-elf-ppc64.c | 2 -- > 1 files changed, 0 insertions(+), 2 deletions(-) > > diff --git a/kexec/arch/ppc64/kexec-elf-ppc64.c b/kexec/arch/ppc64/kexec-elf-ppc64.c > index 21533cb..1e2d5a3 100644 > --- a/kexec/arch/ppc64/kexec-elf-ppc64.c > +++ b/kexec/arch/ppc64/kexec-elf-ppc64.c > @@ -151,8 +151,6 @@ int elf_ppc64_load(int argc, char **argv, const char *buf, off_t len, > if (ramdisk && reuse_initrd) > die("Can't specify --ramdisk or --initrd with --reuseinitrd\n"); > > - setup_memory_ranges(info->kexec_flags); > - > /* Need to append some command line parameters internally in case of > * taking crash dumps. > */ > -- > 1.5.3.7.1.g4e596e > >
On Wed, 2009-01-21 at 08:30 +1100, Simon Horman wrote: > On Wed, Jan 07, 2009 at 05:01:26PM +1100, Michael Ellerman wrote: > > Hi all, > > > > The patch to dynamically allocate memory regions for ppc64 kexec-tools, > > ie. "ppc64: kexec memory ranges dynamic allocation" (d182ce5), has never > > worked AFAICT. > > > > Chandru reported it as broken when it was posted: > > http://lists.infradead.org/pipermail/kexec/2008-October/002751.html > > > > Still, it's in now, and I'm trying to work out what's going wrong. > > > > The symptom is as reported by Chandru, we end up not being able to > > allocate any memory (in locate_hole()). This is caused by the list of > > memory_ranges being empty. > > > > The memory_ranges are empty because they have been realloc'ed (by the > > dynamic alloc code), and the generic code is still looking at the old > > version. > > > > What I'm not clear on is why the ppc64 code is even calling > > setup_memory_ranges() a second time (in elf_ppc64_load()). It's already > > been called by get_memory_ranges() from my_load(). Or is there another > > route into elf_ppc64_load() that I'm not seeing? > > > > And in fact if I just remove that call, then everything is peachy. > > > > The following patch makes it work for me at least. > > Hi Michael, > > I must confess that I don't have a complete understanding of this problem. > Does Bernhard's recent patch (sorry that I applied it even though > it came in after your patch) help this problem? Hi Horms, Well to be honest neither do I, I was hoping someone who'd written or helped write the original code would comment. Bernhard's patch will help, but I think mine is a better solution. > commit 95c74405638c786bc76fbca5e4e8427dfe26e907 > Author: Bernhard Walle <bwalle@suse.de> > Date: Fri Jan 16 19:11:34 2009 +0100 > Subject: Fix memory corruption when using realloc_memory_ranges() > Because realloc_memory_ranges() makes the old memory invalid, and we return > a pointer to memory_range in get_memory_ranges(), we need to copy the contents > in get_memory_ranges(). > > Some code that calls realloc_memory_ranges() may be triggered by > get_base_ranges() which is called after get_memory_ranges(). > > Yes, the memory needs to be deleted somewhere, but I don't know currently > where it's the best, and since it's not in a loop and memory is deleted > anyway after program termination I don't want to introduce unneccessary > complexity. The problem is that get_base_ranges() gets called from > architecture independent code and that allocation is PPC64-specific here. I don't see where get_base_ranges() is called other than from PPC64 code, so I'm confused about this comment. What I see happening is: * get_memory_ranges() is called early in kexec.c and saves a pointer to the memory ranges in "info". * Any subsequent call which causes the memory ranges to be realloc'ed will screw that up, because now info will point at free'd memory. * Later on in elf_ppc64_load() we call setup_memory_ranges() (again). * That may cause the ranges to be realloc'ed, which would be bad. * But the second call to setup_memory_ranges() is useless, because it doesn't update info, and info is what we keep using for allocations. * So if setup_memory_ranges() found new ranges, they would never be used, even apart from the corruption issue. So we may as well not call it. * If there are /other/ code paths where we can realloc memory ranges then maybe we /also/ need Bernhard's patch. But that was only a 10 minute analysis, so maybe I'm wrong ;) cheers
Patch
diff --git a/kexec/arch/ppc64/kexec-elf-ppc64.c b/kexec/arch/ppc64/kexec-elf-ppc64.c index 21533cb..1e2d5a3 100644 --- a/kexec/arch/ppc64/kexec-elf-ppc64.c +++ b/kexec/arch/ppc64/kexec-elf-ppc64.c @@ -151,8 +151,6 @@ int elf_ppc64_load(int argc, char **argv, const char *buf, off_t len, if (ramdisk && reuse_initrd) die("Can't specify --ramdisk or --initrd with --reuseinitrd\n"); - setup_memory_ranges(info->kexec_flags); - /* Need to append some command line parameters internally in case of * taking crash dumps. */