Message ID | 200901191700.03580.chandru@in.ibm.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Delegated to: | Benjamin Herrenschmidt |
Headers | show |
Chandru wrote: > In case either physbase or reserve_size are not page aligned and in addition > if the following condition is also true > > node_ar.end_pfn = node->node_end_pfn = PFN_DOWN(physbase+reserve_size). > > we may hit the BUG_ON(end > bdata->node_low_pfn) in mark_bootmem_node() in > mm/bootmem.c Hence pass the pfn that the physbase is part of and align > reserve_size before calling reserve_bootmem_node(). > > Signed-off-by: Chandru S <chandru@linux.vnet.ibm.com> > Cc: Dave Hansen <dave@linux.vnet.ibm.com> > --- > arch/powerpc/mm/numa.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > --- linux-2.6.29-rc2/arch/powerpc/mm/numa.c.orig 2009-01-19 16:14:49.000000000 +0530 > +++ linux-2.6.29-rc2/arch/powerpc/mm/numa.c 2009-01-19 16:36:38.000000000 +0530 > @@ -901,7 +901,8 @@ static void mark_reserved_regions_for_ni > get_node_active_region(start_pfn, &node_ar); > while (start_pfn < end_pfn && > node_ar.start_pfn < node_ar.end_pfn) { > - unsigned long reserve_size = size; > + unsigned long reserve_size = (size >> PAGE_SHIFT) << > + PAGE_SHIFT; > /* > * if reserved region extends past active region > * then trim size to active region > @@ -917,7 +918,8 @@ static void mark_reserved_regions_for_ni > dbg("reserve_bootmem %lx %lx nid=%d\n", > physbase, reserve_size, node_ar.nid); > reserve_bootmem_node(NODE_DATA(node_ar.nid), > - physbase, reserve_size, > + (start_pfn << PAGE_SHIFT), > + reserve_size, > BOOTMEM_DEFAULT); > } > /* > does this patch look good ?, do you concur with it ? thanks, Chandru
On Mon, 2009-01-19 at 17:00 +0530, Chandru wrote: > --- linux-2.6.29-rc2/arch/powerpc/mm/numa.c.orig 2009-01-19 16:14:49.000000000 +0530 > +++ linux-2.6.29-rc2/arch/powerpc/mm/numa.c 2009-01-19 16:36:38.000000000 +0530 > @@ -901,7 +901,8 @@ static void mark_reserved_regions_for_ni > get_node_active_region(start_pfn, &node_ar); > while (start_pfn < end_pfn && > node_ar.start_pfn < node_ar.end_pfn) { > - unsigned long reserve_size = size; > + unsigned long reserve_size = (size >> PAGE_SHIFT) << > + PAGE_SHIFT; > /* > * if reserved region extends past active region > * then trim size to active region > @@ -917,7 +918,8 @@ static void mark_reserved_regions_for_ni > dbg("reserve_bootmem %lx %lx nid=%d\n", > physbase, reserve_size, node_ar.nid); > reserve_bootmem_node(NODE_DATA(node_ar.nid), > - physbase, reserve_size, > + (start_pfn << PAGE_SHIFT), > + reserve_size, > BOOTMEM_DEFAULT); > } > /* Chandru, I don't mean to keep ragging on your patches, but I really don't think this is right, yet. Let's take, for instance, a 1-byte reservation. With this code, you've suddenly turned that into a 0-byte reservation, and that *can't* be right. The same thing happens if you have a reservation that spans two pages. If you unconditionally round it down, then you might miss the part that spans a portion of the second page. It needs to be rounded down like you are suggesting here, but only in the case where we've gone over the *CURRENT* node's boundary. This is kinda what that "if (end_pfn > node_ar.end_pfn)" check is doing. But, it evidently screws it up if the overlap isn't by an entire page or something. Please also, for pete's sake, use masks (a la PAGE_MASK) or macros if you're going to page-align something. Don't shift down and up like that. -- Dave
On Thursday 22 January 2009 05:59:39 Dave Hansen wrote: > Let's take, for instance, a 1-byte reservation. With this code, you've > suddenly turned that into a 0-byte reservation, and that *can't* be > right. The same thing happens if you have a reservation that spans two > pages. If you unconditionally round it down, then you might miss the > part that spans a portion of the second page. > > It needs to be rounded down like you are suggesting here, but only in > the case where we've gone over the *CURRENT* node's boundary. This is > kinda what that "if (end_pfn > node_ar.end_pfn)" check is doing. But, > it evidently screws it up if the overlap isn't by an entire page or > something. I assumed the condition 'while (start_pfn < end_pfn && .. )' asks for atleast a PAGE_SIZE difference between them and hence went ahead with that patch. My guess was a 1-byte , 2-byte or a (PAGE_SIZE -1)-byte reservations may not even go into that loop. However we just need a fix for this problem. So if there is a better fix that you have please post it to lkml. Thanks, Chandru
--- linux-2.6.29-rc2/arch/powerpc/mm/numa.c.orig 2009-01-19 16:14:49.000000000 +0530 +++ linux-2.6.29-rc2/arch/powerpc/mm/numa.c 2009-01-19 16:36:38.000000000 +0530 @@ -901,7 +901,8 @@ static void mark_reserved_regions_for_ni get_node_active_region(start_pfn, &node_ar); while (start_pfn < end_pfn && node_ar.start_pfn < node_ar.end_pfn) { - unsigned long reserve_size = size; + unsigned long reserve_size = (size >> PAGE_SHIFT) << + PAGE_SHIFT; /* * if reserved region extends past active region * then trim size to active region @@ -917,7 +918,8 @@ static void mark_reserved_regions_for_ni dbg("reserve_bootmem %lx %lx nid=%d\n", physbase, reserve_size, node_ar.nid); reserve_bootmem_node(NODE_DATA(node_ar.nid), - physbase, reserve_size, + (start_pfn << PAGE_SHIFT), + reserve_size, BOOTMEM_DEFAULT); } /*
In case either physbase or reserve_size are not page aligned and in addition if the following condition is also true node_ar.end_pfn = node->node_end_pfn = PFN_DOWN(physbase+reserve_size). we may hit the BUG_ON(end > bdata->node_low_pfn) in mark_bootmem_node() in mm/bootmem.c Hence pass the pfn that the physbase is part of and align reserve_size before calling reserve_bootmem_node(). Signed-off-by: Chandru S <chandru@linux.vnet.ibm.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> --- arch/powerpc/mm/numa.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)