Message ID | 5212DA31.2060105@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Superseded |
On Mon, 2013-08-19 at 21:53 -0500, Nathan Fontenot wrote:
> Previous commit 46723bfa540... introduced a new config option
> HAVE_BOOTMEM_INFO_NODE that ended up breaking memory hot-remove for ppc
> when sparse vmemmap is not defined.
>
> This patch defines HAVE_BOOTMEM_INFO_NODE for ppc and adds the call to
> register_page_bootmem_info_node. Without this we get a BUG_ON for memory
> hot remove in put_page_bootmem().
>
> This also adds a stub for register_page_bootmem_memmap to allow ppc to build
> with sparse vmemmap defined.
>
> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> ---

So I still feel very uncomfortable with that stuff ....

For example, x86 calls register_page_bootmem_info_node() at boot time,
which does that strange "get_page_bootmem" on the NODE_DATA itself at
boot time, we don't. Should we?

Since we don't, what does that mean? We don't remove the node info pages
on unplug? Is that ok?

There's a whole pile of totally undocumented / uncommented generic code
with horrible function names in there whose semantics are very very
unclear.

Now, if we call that thing, are we expected to have
register_page_bootmem_memmap() actually do something, right? I assume
that means actually calling get_page_bootmem() on the various struct
pages that comprise the vmemmap.

Well, we can probably implement that since we maintain a list of all the
vmemmap pages... However, we don't implement vmemmap_free(). Should we?

This all confuses me...

Cheers,
Ben.

>
> ---
>  arch/powerpc/mm/init_64.c |    4 ++++
>  arch/powerpc/mm/mem.c     |    9 +++++++++
>  mm/Kconfig                |    2 +-
>  3 files changed, 14 insertions(+), 1 deletion(-)
>
> Index: linux/arch/powerpc/mm/init_64.c
> ===================================================================
> --- linux.orig/arch/powerpc/mm/init_64.c
> +++ linux/arch/powerpc/mm/init_64.c
> @@ -300,5 +300,9 @@ void vmemmap_free(unsigned long start, u
>  {
>  }
>
> +void register_page_bootmem_memmap(unsigned long section_nr,
> +				  struct page *start_page, unsigned long size)
> +{
> +}
>  #endif /* CONFIG_SPARSEMEM_VMEMMAP */
>
> Index: linux/arch/powerpc/mm/mem.c
> ===================================================================
> --- linux.orig/arch/powerpc/mm/mem.c
> +++ linux/arch/powerpc/mm/mem.c
> @@ -297,12 +297,21 @@ void __init paging_init(void)
>  }
>  #endif /* ! CONFIG_NEED_MULTIPLE_NODES */
>
> +static void __init register_page_bootmem_info(void)
> +{
> +	int i;
> +
> +	for_each_online_node(i)
> +		register_page_bootmem_info_node(NODE_DATA(i));
> +}
> +
>  void __init mem_init(void)
>  {
>  #ifdef CONFIG_SWIOTLB
>  	swiotlb_init(0);
>  #endif
>
> +	register_page_bootmem_info();
>  	high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
>  	set_max_mapnr(max_pfn);
>  	free_all_bootmem();
> Index: linux/mm/Kconfig
> ===================================================================
> --- linux.orig/mm/Kconfig
> +++ linux/mm/Kconfig
> @@ -183,7 +183,7 @@ config MEMORY_HOTPLUG_SPARSE
>  config MEMORY_HOTREMOVE
>  	bool "Allow for memory hot remove"
>  	select MEMORY_ISOLATION
> -	select HAVE_BOOTMEM_INFO_NODE if X86_64
> +	select HAVE_BOOTMEM_INFO_NODE if (X86_64 || PPC64)
>  	depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
>  	depends on MIGRATION
>
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

On Tue, 2013-08-27 at 13:44 +1000, Benjamin Herrenschmidt wrote:
> So I still feel very uncomfortable with that stuff ....
>
> For example, x86 calls register_page_bootmem_info_node() at boot time,
> which does that strange "get_page_bootmem" on the NODE_DATA itself at
> boot time, we don't. Should we?

Bah, call me an idiot ... I was looking at the code without your patch
and not realizing that this is exactly what your patch does :-)

.../...

> There's a whole pile of totally undocumented / uncommented generic code
> with horrible function names in there whose semantics are very very
> unclear.
>
> Now, if we call that thing, are we expected to have
> register_page_bootmem_memmap() actually do something, right? I assume
> that means actually calling get_page_bootmem() on the various struct
> pages that comprise the vmemmap.
>
> Well, we can probably implement that since we maintain a list of all the
> vmemmap pages... However, we don't implement vmemmap_free(). Should we?

This still stands: should we actually "register" the pages of the
vmemmap or not?

What happens if we remove a chunk of memory and then plug it back in?
Will it try to re-create a new vmemmap chunk for that area (where we
haven't removed the previous one)? That might cause problems if we end
up putting duplicate entries in the hash table ... should we implement
vmemmap_free and actually unmap the segments?

> Cheers,
> Ben.

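The remove-then-re-add question above can be made concrete with a small user-space sketch. This is only a toy model, not powerpc code: `struct model_map` stands in for whatever tracks vmemmap mappings, and every `model_*` name is hypothetical. It assumes the worst case Ben describes, where populate does not check for an existing mapping and the free hook is an empty body like the `vmemmap_free()` visible in the diff context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_ENTRIES 8

/* Toy mapping table: one entry per mapped vmemmap chunk (hypothetical). */
struct model_map {
	unsigned long virt[MODEL_MAX_ENTRIES];
	size_t nr;
};

/* Count how many entries map the given virtual address. */
size_t model_count(const struct model_map *map, unsigned long virt)
{
	size_t i, hits = 0;

	for (i = 0; i < map->nr; i++)
		if (map->virt[i] == virt)
			hits++;
	return hits;
}

/*
 * Map one chunk. With check_existing set, a chunk that is already
 * mapped is left alone; without it, a second entry is added.
 */
int model_vmemmap_populate(struct model_map *map, unsigned long virt,
			   bool check_existing)
{
	if (check_existing && model_count(map, virt) > 0)
		return 0;
	if (map->nr == MODEL_MAX_ENTRIES)
		return -1;
	map->virt[map->nr++] = virt;
	return 0;
}

/* Does nothing, mirroring an empty vmemmap_free() body. */
void model_vmemmap_free_stub(struct model_map *map, unsigned long virt)
{
	(void)map;
	(void)virt;
}

/* A free that really drops every entry for the chunk. */
void model_vmemmap_free_real(struct model_map *map, unsigned long virt)
{
	size_t i = 0;

	while (i < map->nr) {
		if (map->virt[i] == virt)
			map->virt[i] = map->virt[--map->nr];
		else
			i++;
	}
}
```

In this model, populate, stub free, populate again (with `check_existing` false) leaves two entries for the same address, which is the duplicate-entry case being asked about. Either a real free or an existence check in populate avoids it; which of the two the real code should rely on is exactly the open question in the thread.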
Index: linux/arch/powerpc/mm/init_64.c
===================================================================
--- linux.orig/arch/powerpc/mm/init_64.c
+++ linux/arch/powerpc/mm/init_64.c
@@ -300,5 +300,9 @@ void vmemmap_free(unsigned long start, u
 {
 }
 
+void register_page_bootmem_memmap(unsigned long section_nr,
+				  struct page *start_page, unsigned long size)
+{
+}
 #endif /* CONFIG_SPARSEMEM_VMEMMAP */
 
Index: linux/arch/powerpc/mm/mem.c
===================================================================
--- linux.orig/arch/powerpc/mm/mem.c
+++ linux/arch/powerpc/mm/mem.c
@@ -297,12 +297,21 @@ void __init paging_init(void)
 }
 #endif /* ! CONFIG_NEED_MULTIPLE_NODES */
 
+static void __init register_page_bootmem_info(void)
+{
+	int i;
+
+	for_each_online_node(i)
+		register_page_bootmem_info_node(NODE_DATA(i));
+}
+
 void __init mem_init(void)
 {
 #ifdef CONFIG_SWIOTLB
 	swiotlb_init(0);
 #endif
 
+	register_page_bootmem_info();
 	high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
 	set_max_mapnr(max_pfn);
 	free_all_bootmem();
Index: linux/mm/Kconfig
===================================================================
--- linux.orig/mm/Kconfig
+++ linux/mm/Kconfig
@@ -183,7 +183,7 @@ config MEMORY_HOTPLUG_SPARSE
 config MEMORY_HOTREMOVE
 	bool "Allow for memory hot remove"
 	select MEMORY_ISOLATION
-	select HAVE_BOOTMEM_INFO_NODE if X86_64
+	select HAVE_BOOTMEM_INFO_NODE if (X86_64 || PPC64)
 	depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
 	depends on MIGRATION
 

Previous commit 46723bfa540... introduced a new config option
HAVE_BOOTMEM_INFO_NODE that ended up breaking memory hot-remove for ppc
when sparse vmemmap is not defined.

This patch defines HAVE_BOOTMEM_INFO_NODE for ppc and adds the call to
register_page_bootmem_info_node. Without this we get a BUG_ON for memory
hot remove in put_page_bootmem().

This also adds a stub for register_page_bootmem_memmap to allow ppc to build
with sparse vmemmap defined.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
---

---
 arch/powerpc/mm/init_64.c |    4 ++++
 arch/powerpc/mm/mem.c     |    9 +++++++++
 mm/Kconfig                |    2 +-
 3 files changed, 14 insertions(+), 1 deletion(-)
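
To illustrate why the boot-time registration matters for the BUG_ON mentioned above, here is a minimal user-space sketch of the get/put pairing. It is a simplified model under stated assumptions, not the kernel implementation: it assumes `get_page_bootmem()` tags a page with a type and takes a reference, and that `put_page_bootmem()` refuses a page whose tag is outside the valid range. All `model_*` and `MODEL_*` names and values are hypothetical, and the put returns -1 where the kernel would hit its BUG_ON.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type tags; the values are not the kernel's. */
enum {
	MODEL_TYPE_NONE = 0,
	MODEL_TYPE_MIN = 1,
	MODEL_TYPE_NODE_INFO = 1,
	MODEL_TYPE_MAX = 3,
};

/* Stand-in for the few struct page fields this model cares about. */
struct model_page {
	unsigned long type;	/* tag set at registration time */
	int count;		/* reference count */
	bool freed;		/* set once the page is handed back */
};

/* Registration: tag the page and take one extra reference. */
void model_get_page_bootmem(struct model_page *page, unsigned long type)
{
	page->type = type;
	page->count++;
}

/*
 * Release: returns -1 for a page that was never tagged (the case the
 * kernel treats as a bug), 0 otherwise. The page is freed when only
 * the initial reference is left.
 */
int model_put_page_bootmem(struct model_page *page)
{
	if (page->type < MODEL_TYPE_MIN || page->type > MODEL_TYPE_MAX)
		return -1;
	if (--page->count == 1) {
		page->type = MODEL_TYPE_NONE;
		page->freed = true;
	}
	return 0;
}

/* Hot-remove of one node-info page, with or without boot registration. */
int model_hot_remove(bool registered_at_boot)
{
	struct model_page page = {
		.type = MODEL_TYPE_NONE,
		.count = 1,
		.freed = false,
	};

	if (registered_at_boot)
		model_get_page_bootmem(&page, MODEL_TYPE_NODE_INFO);
	return model_put_page_bootmem(&page);
}
```

In this model, `model_hot_remove(false)` corresponds to ppc before the patch (the put fails because nothing registered the page), and `model_hot_remove(true)` corresponds to the behaviour once `register_page_bootmem_info()` runs from `mem_init()`.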