Patchwork [v2,2/2] Register bootmem pages

Submitter Nathan Fontenot
Date Aug. 20, 2013, 2:53 a.m.
Message ID <5212DA31.2060105@linux.vnet.ibm.com>
Permalink /patch/268351/
State Superseded

Comments

Nathan Fontenot - Aug. 20, 2013, 2:53 a.m.
Previous commit 46723bfa540... introduced a new config option
HAVE_BOOTMEM_INFO_NODE that ended up breaking memory hot-remove for ppc
when sparse vmemmap is not defined.

This patch defines HAVE_BOOTMEM_INFO_NODE for ppc and adds the call to
register_page_bootmem_info_node. Without this we get a BUG_ON for memory
hot remove in put_page_bootmem().

This also adds a stub for register_page_bootmem_memmap to allow ppc to build
with sparse vmemmap defined.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
---

---
 arch/powerpc/mm/init_64.c |    4 ++++
 arch/powerpc/mm/mem.c     |    9 +++++++++
 mm/Kconfig                |    2 +-
 3 files changed, 14 insertions(+), 1 deletion(-)
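
[Editor's sketch, not part of the patch.] The BUG_ON described in the
commit message can be illustrated with a small userspace model. Everything
below is an assumption made for illustration: the toy_* names, the numeric
type values, and the simplified page structure are hypothetical stand-ins,
not the kernel's definitions. The idea it models is that registration tags
a page with a bootmem type and takes a reference, and that the put side
refuses any page whose tag is outside the valid range, which is what an
unregistered page looks like at hot-remove time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type range; the real kernel constants are not reproduced. */
enum {
    TOY_TYPE_NONE      = 0,
    TOY_TYPE_MIN       = 12,
    TOY_TYPE_NODE_INFO = 14,
    TOY_TYPE_MAX       = 14
};

/* Simplified stand-in for struct page: a type tag and a reference count. */
struct toy_page {
    unsigned long type;
    int count;
};

/* Models registration: tag the page and take a reference on it. */
void toy_get_page_bootmem(struct toy_page *page, unsigned long type)
{
    assert(page != NULL);
    page->type = type;
    page->count++;
}

/*
 * Models the put side. Returns false where this model assumes the kernel
 * would hit its BUG_ON, i.e. when the page was never registered.
 */
bool toy_put_page_bootmem(struct toy_page *page)
{
    assert(page != NULL);
    if (page->type < TOY_TYPE_MIN || page->type > TOY_TYPE_MAX)
        return false;               /* unregistered page: BUG_ON in this model */
    if (--page->count == 1)
        page->type = TOY_TYPE_NONE; /* last bootmem reference dropped */
    return true;
}
```

In this model, a page that starts with count 1 and is never passed to
toy_get_page_bootmem() fails the put, while a page registered once at
"boot" survives it and ends up back at count 1 with its tag cleared.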
Benjamin Herrenschmidt - Aug. 27, 2013, 3:44 a.m.
On Mon, 2013-08-19 at 21:53 -0500, Nathan Fontenot wrote:
> Previous commit 46723bfa540... introduced a new config option
> HAVE_BOOTMEM_INFO_NODE that ended up breaking memory hot-remove for ppc
> when sparse vmemmap is not defined.
> 
> This patch defines HAVE_BOOTMEM_INFO_NODE for ppc and adds the call to
> register_page_bootmem_info_node. Without this we get a BUG_ON for memory
> hot remove in put_page_bootmem().
> 
> This also adds a stub for register_page_bootmem_memmap to allow ppc to build
> with sparse vmemmap defined.
> 
> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> ---

So I still feel very uncomfortable with that stuff ....

For example, x86 calls register_page_bootmem_info_node() at boot time,
which does that strange "get_page_bootmem" on the NODE_DATA itself at
boot time, we don't. Should we ?

Since we don't, what does that mean ? We don't remove the node info pages
on unplug ? Is that ok ?

There's a whole pile of totally undocumented / uncommented generic code
with horrible function names in there whose semantics are very, very
unclear.

Now, if we call that thing, are we expected to have
register_page_bootmem_memmap() actually do something, right? I assume
that means actually calling get_page_bootmem() on the various struct
page that comprise the vmemmap.

Well, we can probably implement that since we maintain a list of all the
vmemmap pages... However, we don't implement vmemmap_free(). Should we ?

This all confuses me...

Cheers,
Ben.
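
[Editor's sketch, not part of the thread.] The assumption voiced above,
that a non-stub register_page_bootmem_memmap() would take a reference on
each page backing the relevant part of the vmemmap, can be modelled in
userspace as a walk over a list of backing blocks. The structure layout,
the toy_* names, and the overlap rule are all hypothetical; this does not
claim to match how powerpc actually tracks its vmemmap backing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one block that backs part of the vmemmap. */
struct toy_backing {
    struct toy_backing *next;
    unsigned long virt;   /* start of the virtual range this block backs */
    unsigned long len;    /* length of that range */
    int refs;             /* bootmem-style references taken on the block */
};

/*
 * Take one reference on every block overlapping [start, start + size).
 * Returns how many blocks were touched.
 */
int toy_register_memmap(struct toy_backing *list,
                        unsigned long start, unsigned long size)
{
    int touched = 0;

    for (struct toy_backing *b = list; b != NULL; b = b->next) {
        /* Half-open interval overlap test. */
        if (b->virt < start + size && start < b->virt + b->len) {
            b->refs++;
            touched++;
        }
    }
    return touched;
}
```

A matching release path would have to drop the same references, which is
where the open question about vmemmap_free() comes in.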

Benjamin Herrenschmidt - Aug. 27, 2013, 7:39 a.m.
On Tue, 2013-08-27 at 13:44 +1000, Benjamin Herrenschmidt wrote:

> So I still feel very uncomfortable with that stuff ....
> 
> For example, x86 calls register_page_bootmem_info_node() at boot time,
> which does that strange "get_page_bootmem" on the NODE_DATA itself at
> boot time, we don't. Should we ?

Bah, call me an idiot ... I was looking at the code without your patch
and not realizing that this is exactly what your patch does :-)

 .../...

> There's a whole pile of totally undocumented / uncommented generic code
> with horrible function names in there whose semantics are very, very
> unclear.
> 
> Now, if we call that thing, are we expected to have
> register_page_bootmem_memmap() actually do something, right? I assume
> that means actually calling get_page_bootmem() on the various struct
> page that comprise the vmemmap.
> 
> Well, we can probably implement that since we maintain a list of all the
> vmemmap pages... However, we don't implement vmemmap_free(). Should we ?

This still stands, should we actually "register" the pages of the
vmemmap or not ?

What happens if we remove a chunk of memory and then plug it back in ?
Will it try to re-create a new vmemmap chunk for that area (where we
haven't removed the previous one) ? That might cause problems if we end
up putting duplicate entries in the hash table ... should we implement
vmemmap_free and actually unmap the segments ?


Patch

Index: linux/arch/powerpc/mm/init_64.c
===================================================================
--- linux.orig/arch/powerpc/mm/init_64.c
+++ linux/arch/powerpc/mm/init_64.c
@@ -300,5 +300,9 @@  void vmemmap_free(unsigned long start, u
 {
 }

+void register_page_bootmem_memmap(unsigned long section_nr,
+				  struct page *start_page, unsigned long size)
+{
+}
 #endif /* CONFIG_SPARSEMEM_VMEMMAP */

Index: linux/arch/powerpc/mm/mem.c
===================================================================
--- linux.orig/arch/powerpc/mm/mem.c
+++ linux/arch/powerpc/mm/mem.c
@@ -297,12 +297,21 @@  void __init paging_init(void)
 }
 #endif /* ! CONFIG_NEED_MULTIPLE_NODES */

+static void __init register_page_bootmem_info(void)
+{
+	int i;
+
+	for_each_online_node(i)
+		register_page_bootmem_info_node(NODE_DATA(i));
+}
+
 void __init mem_init(void)
 {
 #ifdef CONFIG_SWIOTLB
 	swiotlb_init(0);
 #endif

+	register_page_bootmem_info();
 	high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
 	set_max_mapnr(max_pfn);
 	free_all_bootmem();
Index: linux/mm/Kconfig
===================================================================
--- linux.orig/mm/Kconfig
+++ linux/mm/Kconfig
@@ -183,7 +183,7 @@  config MEMORY_HOTPLUG_SPARSE
 config MEMORY_HOTREMOVE
 	bool "Allow for memory hot remove"
 	select MEMORY_ISOLATION
-	select HAVE_BOOTMEM_INFO_NODE if X86_64
+	select HAVE_BOOTMEM_INFO_NODE if (X86_64 || PPC64)
 	depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
 	depends on MIGRATION