Patchwork memory hotplug: Correct page reservation checking

Submitter Nathan Fontenot
Date Sept. 26, 2011, 3:22 p.m.
Message ID <4E8098B9.1080702@austin.ibm.com>
Permalink /patch/116436/
State Not Applicable

Comments

Nathan Fontenot - Sept. 26, 2011, 3:22 p.m.
The check to ensure that pages of recently added memory sections are correctly
marked as reserved before trying to online the memory is broken.  The request
to online the memory fails with the following:

kernel: section number XXX page number 256 not reserved, was it already online?

This updates the page reservation checking to check the pages of each memory
section of the memory block being onlined individually.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
---
 drivers/base/memory.c |   60 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 37 insertions(+), 23 deletions(-)
Andrew Morton - Oct. 4, 2011, 12:50 a.m.
On Mon, 26 Sep 2011 10:22:33 -0500
Nathan Fontenot <nfont@austin.ibm.com> wrote:

> The check to ensure that pages of recently added memory sections are correctly
> marked as reserved before trying to online the memory is broken.  The request
> to online the memory fails with the following:
> 
> kernel: section number XXX page number 256 not reserved, was it already online?
> 
> This updates the page reservation checking to check the pages of each memory
> section of the memory block being onlined individually.

Why was this only noticed now?  Is there something unusual about the
way in which you're using it, or has nobody ever used this code, or...?
Nathan Fontenot - Oct. 4, 2011, 2:19 p.m.
On 10/03/2011 07:50 PM, Andrew Morton wrote:
> On Mon, 26 Sep 2011 10:22:33 -0500
> Nathan Fontenot <nfont@austin.ibm.com> wrote:
> 
>> The check to ensure that pages of recently added memory sections are correctly
>> marked as reserved before trying to online the memory is broken.  The request
>> to online the memory fails with the following:
>>
>> kernel: section number XXX page number 256 not reserved, was it already online?
>>
>> This updates the page reservation checking to check the pages of each memory
>> section of the memory block being onlined individually.
> 
> Why was this only noticed now?  Is there something unusual about the
> way in which you're using it, or has nobody ever used this code, or...?
> 

As far as I know, only the powerpc/pseries code uses the feature that allows
memory blocks in sysfs to span multiple memory sections.  We do this because
on pseries memory add/remove is done on a per-LMB basis, and we can have
machines where an LMB spans multiple memory sections.

This was only noticed now because of a lack of testing between the 2.6.38/39
kernels, where this feature originally went in, and the current mainline kernel.

-Nathan
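
For readers following the thread, here is a minimal user-space sketch of the
approach the patch description outlines: check one memory section at a time
and loop over the sections of a block. It is not kernel code. The names
struct page, section_map, setup_sections, check_section and check_block are
hypothetical, and the sizes (256 pages per section, 4 sections per block) are
illustrative only. The model gives each section its own page array, so
indexing past the end of one section is not meaningful; whether that matches
the kernel configuration discussed here is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Illustrative sizes only; the real values depend on arch and config. */
#define PAGES_PER_SECTION  256
#define SECTIONS_PER_BLOCK 4

/* Hypothetical stand-in for the kernel's struct page. */
struct page {
	bool reserved;
};

/* One separately allocated page array per section (modelling assumption). */
static struct page *section_map[SECTIONS_PER_BLOCK];

/* Allocate every section and mark all of its pages reserved. */
static int setup_sections(void)
{
	for (int s = 0; s < SECTIONS_PER_BLOCK; s++) {
		section_map[s] = calloc(PAGES_PER_SECTION, sizeof(struct page));
		if (!section_map[s])
			return -ENOMEM;
		for (int i = 0; i < PAGES_PER_SECTION; i++)
			section_map[s][i].reserved = true;
	}
	return 0;
}

/* Return 0 if every page of one section is reserved, -EBUSY otherwise. */
static int check_section(unsigned long section_nr)
{
	assert(section_nr < SECTIONS_PER_BLOCK);
	struct page *page = section_map[section_nr];

	for (int i = 0; i < PAGES_PER_SECTION; i++) {
		if (page[i].reserved)
			continue;
		fprintf(stderr, "section number %lu page number %d not reserved\n",
			section_nr, i);
		return -EBUSY;
	}
	return 0;
}

/* Check a block section by section, never indexing across a boundary. */
static int check_block(unsigned long start_section_nr, int nr_sections)
{
	for (int i = 0; i < nr_sections; i++) {
		int ret = check_section(start_section_nr + i);
		if (ret)
			return ret;
	}
	return 0;
}
```

After setup_sections() succeeds, check_block(0, SECTIONS_PER_BLOCK) returns 0
while all pages are reserved, and returns -EBUSY as soon as any page in any
section of the block has its reserved flag cleared.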

Patch

Index: linux/drivers/base/memory.c
===================================================================
--- linux.orig/drivers/base/memory.c	2011-09-26 08:33:14.000000000 -0500
+++ linux/drivers/base/memory.c	2011-09-26 08:42:14.000000000 -0500
@@ -227,41 +227,42 @@ 
  * MEMORY_HOTPLUG depends on SPARSEMEM in mm/Kconfig, so it is
  * OK to have direct references to sparsemem variables in here.
  */
+static int check_page_reservations(unsigned long phys_index)
+{
+	int i;
+	struct page *page;
+
+	page = pfn_to_page(phys_index << PFN_SECTION_SHIFT);
+
+	for (i = 0; i < PAGES_PER_SECTION; i++) {
+		if (PageReserved(page + i))
+			continue;
+
+		printk(KERN_WARNING "section number %ld page number %d "
+			"not reserved, was it already online?\n", phys_index, i);
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
 static int
 memory_block_action(unsigned long phys_index, unsigned long action)
 {
-	int i;
 	unsigned long start_pfn, start_paddr;
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
-	struct page *first_page;
+	struct page *page;
 	int ret;
 
-	first_page = pfn_to_page(phys_index << PFN_SECTION_SHIFT);
-
-	/*
-	 * The probe routines leave the pages reserved, just
-	 * as the bootmem code does.  Make sure they're still
-	 * that way.
-	 */
-	if (action == MEM_ONLINE) {
-		for (i = 0; i < nr_pages; i++) {
-			if (PageReserved(first_page+i))
-				continue;
-
-			printk(KERN_WARNING "section number %ld page number %d "
-				"not reserved, was it already online?\n",
-				phys_index, i);
-			return -EBUSY;
-		}
-	}
+	page = pfn_to_page(phys_index << PFN_SECTION_SHIFT);
 
 	switch (action) {
 		case MEM_ONLINE:
-			start_pfn = page_to_pfn(first_page);
+			start_pfn = page_to_pfn(page);
 			ret = online_pages(start_pfn, nr_pages);
 			break;
 		case MEM_OFFLINE:
-			start_paddr = page_to_pfn(first_page) << PAGE_SHIFT;
+			start_paddr = page_to_pfn(page) << PAGE_SHIFT;
 			ret = remove_memory(start_paddr,
 					    nr_pages << PAGE_SHIFT);
 			break;
@@ -277,7 +278,7 @@ 
 static int memory_block_change_state(struct memory_block *mem,
 		unsigned long to_state, unsigned long from_state_req)
 {
-	int ret = 0;
+	int i, ret = 0;
 
 	mutex_lock(&mem->state_mutex);
 
@@ -289,6 +290,19 @@ 
 	if (to_state == MEM_OFFLINE)
 		mem->state = MEM_GOING_OFFLINE;
 
+	if (to_state == MEM_ONLINE) {
+		/*
+		 * The probe routines leave the pages reserved, just
+		 * as the bootmem code does.  Make sure they're still
+		 * that way.
+		 */
+		for (i = 0; i < sections_per_block; i++) {
+			ret = check_page_reservations(mem->start_section_nr + i);
+			if (ret)
+				goto out;
+		}
+	}
+
 	ret = memory_block_action(mem->start_section_nr, to_state);
 
 	if (ret)