Message ID | alpine.DEB.2.00.1205211846120.20916@chino.kir.corp.google.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Hi David,

On Mon, 21 May 2012 18:53:37 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:
>
> Yeah, it's sched/numa since that's what introduced numa_init().  It does
> for_each_node() for each node and does a kmalloc_node() even though that
> node may not be online.  Slub ends up passing this node to the page
> allocator through alloc_pages_exact_node().  CONFIG_DEBUG_VM would have
> caught this and your config confirms its not enabled.
>
> sched/numa either needs a memory hotplug notifier or it needs to pass
> NUMA_NO_NODE for nodes that aren't online.  Until we get the former, the
> following should fix it.
>
>
> sched, numa: Allocate node_queue on any node for offline nodes
>
> struct node_queue must be allocated with NUMA_NO_NODE for nodes that are
> not (yet) online, otherwise the page allocator has a bad zonelist.
>
> Signed-off-by: David Rientjes <rientjes@google.com>

Thanks, that fixes it.

Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>

On Tue, 22 May 2012 13:03:54 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> On Mon, 21 May 2012 18:53:37 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:
> >
> > Yeah, it's sched/numa since that's what introduced numa_init().  It does
> > for_each_node() for each node and does a kmalloc_node() even though that
> > node may not be online.  Slub ends up passing this node to the page
> > allocator through alloc_pages_exact_node().  CONFIG_DEBUG_VM would have
> > caught this and your config confirms its not enabled.
> >
> > sched/numa either needs a memory hotplug notifier or it needs to pass
> > NUMA_NO_NODE for nodes that aren't online.  Until we get the former, the
> > following should fix it.
> >
> >
> > sched, numa: Allocate node_queue on any node for offline nodes
> >
> > struct node_queue must be allocated with NUMA_NO_NODE for nodes that are
> > not (yet) online, otherwise the page allocator has a bad zonelist.
> >
> > Signed-off-by: David Rientjes <rientjes@google.com>
>
> Thanks, that fixes it.
>
> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>

And I will put that patch in linux-next until it (or something better)
appears.

diff --git a/kernel/sched/numa.c b/kernel/sched/numa.c
--- a/kernel/sched/numa.c
+++ b/kernel/sched/numa.c
@@ -885,7 +885,8 @@ static __init int numa_init(void)
 
 	for_each_node(node) {
 		struct node_queue *nq = kmalloc_node(sizeof(*nq),
-				GFP_KERNEL | __GFP_ZERO, node);
+				GFP_KERNEL | __GFP_ZERO,
+				node_online(node) ? node : NUMA_NO_NODE);
 		BUG_ON(!nq);
 
 		spin_lock_init(&nq->lock);
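
The node-selection rule in the patch above can be illustrated outside the kernel with a rough userspace sketch. Everything here is a hypothetical stand-in, not kernel code: `demo_online_mask` and `demo_node_online()` mimic node_online(), `DEMO_NO_NODE` plays the role of NUMA_NO_NODE, and `demo_alloc_node_hint()` simply returns the value that the patched kmalloc_node() call would receive as its node argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; the kernel provides the real ones. */
#define DEMO_MAX_NODES 8
#define DEMO_NO_NODE   (-1)	/* plays the role of NUMA_NO_NODE */

/* Bit n set means node n is online; initially only node 0 is online. */
static unsigned int demo_online_mask = 0x1;

/* Stand-in for node_online(): is this possible node currently online? */
static bool demo_node_online(int node)
{
	assert(node >= 0 && node < DEMO_MAX_NODES);
	return (demo_online_mask >> node) & 1u;
}

/*
 * Node hint to pass to the allocator: the node itself when it is
 * online, otherwise "no preference" so any node may satisfy the request.
 */
static int demo_alloc_node_hint(int node)
{
	return demo_node_online(node) ? node : DEMO_NO_NODE;
}
```

With only node 0 marked online, `demo_alloc_node_hint(0)` yields 0 and `demo_alloc_node_hint(3)` yields `DEMO_NO_NODE`; once bit 3 is set in `demo_online_mask`, the same call yields 3. As noted in the thread, this is a stopgap: a memory hotplug notifier would be needed to place the queue on the node after it comes online.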