Message ID | bcccffb8-8472-463d-cca5-698db5cbc8e1@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Rejected |
Headers | show |
Series | pseries/nodes: Fix issues with memoryless nodes | expand |
Hi Michael, Michael Bringmann <mwb@linux.vnet.ibm.com> writes: > pseries/findnodes: On pseries systems which allow 'hot-add' of This isn't a powerpc or pseries patch, so the subject/prefix is wrong. Also because you're changing generic code you need to provide an explanation that makes sense in general, across all architectures, not just in terms of what the pseries platform does. > resources, we may boot configurations that have CPUs, but no memory > associated to a node by the affinity calculations. This is called a "memory-less node" and is understood by the generic code. > Previously, the > software took a shortcut to collapse initialization and references What software? What shortcut? > to such memoryless nodes with other nodes that did have memory > associated with them at boot. This patch is based on fixes that What fixes? > allow the proper initialization and distinguishment of memoryless > and memory-plus nodes after NUMA initialization. What exactly is unproper about the current code? > It extends the > use of the 'node_to_mem_node()' API from 'topology.h' to modules The term "modules" has a specific meaning in Linux which is not correct here. We would just say "in two functions" or "in two files". > that are allocating node-specific memory at boot, and allows such > references to find available memory in another node. > diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c > index 9f8cffc..a27a31f 100644 > --- a/block/blk-mq-cpumap.c > +++ b/block/blk-mq-cpumap.c > @@ -73,7 +73,8 @@ int blk_mq_hw_queue_to_node(unsigned int *mq_map, unsigned int index) > > for_each_possible_cpu(i) { > if (index == mq_map[i]) > - return local_memory_node(cpu_to_node(i)); > + return local_memory_node( > + node_to_mem_node(cpu_to_node(i))); What is this trying to do? local_memory_node() is supposed to return a "local" node for nodes with no memory. And in fact the comment says: * Used for initializing percpu 'numa_mem' Which is what we do: set_numa_mem(local_memory_node(numa_cpu_lookup_table[cpu])); And is what's returned by node_to_mem_node(): static inline void set_numa_mem(int node) { this_cpu_write(_numa_mem_, node); _node_numa_mem_[numa_node_id()] = node; } static inline int node_to_mem_node(int node) { return _node_numa_mem_[node]; } So your change effectively ends up doing: return local_memory_node(local_memory_node(cpu_to_node(i))); Which doesn't look right. cheers
Hello: Sorry for the out-of-date description. This entire patch has been removed / eliminated from subsequent patch sets. All changes to correct powerpc memoryless nodes will be confined to powerpc-specific code. Regards, Michael On 10/19/2017 03:56 AM, Michael Ellerman wrote: > Hi Michael, > > Michael Bringmann <mwb@linux.vnet.ibm.com> writes: >> pseries/findnodes: On pseries systems which allow 'hot-add' of > > This isn't a powerpc or pseries patch, so the subject/prefix is wrong. > > Also because you're changing generic code you need to provide an > explanation that makes sense in general, across all architectures, not > just in terms of what the pseries platform does. > >> resources, we may boot configurations that have CPUs, but no memory >> associated to a node by the affinity calculations. > > This is called a "memory-less node" and is understood by the generic > code. > >> Previously, the >> software took a shortcut to collapse initialization and references > > What software? What shortcut? > >> to such memoryless nodes with other nodes that did have memory >> associated with them at boot. This patch is based on fixes that > > What fixes? > >> allow the proper initialization and distinguishment of memoryless >> and memory-plus nodes after NUMA initialization. > > What exactly is unproper about the current code? > >> It extends the >> use of the 'node_to_mem_node()' API from 'topology.h' to modules > > The term "modules" has a specific meaning in Linux which is not correct > here. We would just say "in two functions" or "in two files". > >> that are allocating node-specific memory at boot, and allows such >> references to find available memory in another node. > > >> diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c >> index 9f8cffc..a27a31f 100644 >> --- a/block/blk-mq-cpumap.c >> +++ b/block/blk-mq-cpumap.c >> @@ -73,7 +73,8 @@ int blk_mq_hw_queue_to_node(unsigned int *mq_map, unsigned int index) >> >> for_each_possible_cpu(i) { >> if (index == mq_map[i]) >> - return local_memory_node(cpu_to_node(i)); >> + return local_memory_node( >> + node_to_mem_node(cpu_to_node(i))); > > What is this trying to do? > > local_memory_node() is supposed to return a "local" node for nodes with > no memory. > > And in fact the comment says: > > * Used for initializing percpu 'numa_mem' > > Which is what we do: > > set_numa_mem(local_memory_node(numa_cpu_lookup_table[cpu])); > > And is what's returned by node_to_mem_node(): > > static inline void set_numa_mem(int node) > { > this_cpu_write(_numa_mem_, node); > _node_numa_mem_[numa_node_id()] = node; > } > > static inline int node_to_mem_node(int node) > { > return _node_numa_mem_[node]; > } > > So your change effectively ends up doing: > > return local_memory_node(local_memory_node(cpu_to_node(i))); > > Which doesn't look right. > > > cheers > >
diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c index 9f8cffc..a27a31f 100644 --- a/block/blk-mq-cpumap.c +++ b/block/blk-mq-cpumap.c @@ -73,7 +73,8 @@ int blk_mq_hw_queue_to_node(unsigned int *mq_map, unsigned int index) for_each_possible_cpu(i) { if (index == mq_map[i]) - return local_memory_node(cpu_to_node(i)); + return local_memory_node( + node_to_mem_node(cpu_to_node(i))); } return NUMA_NO_NODE; diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 77e4d3c..e7aaa2a 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4188,6 +4188,7 @@ struct page * gfp_mask &= gfp_allowed_mask; alloc_mask = gfp_mask; + preferred_nid = node_to_mem_node(preferred_nid); if (!prepare_alloc_pages(gfp_mask, order, preferred_nid, nodemask, &ac, &alloc_mask, &alloc_flags)) return NULL;
pseries/findnodes: On pseries systems which allow 'hot-add' of resources, we may boot configurations that have CPUs, but no memory associated to a node by the affinity calculations. Previously, the software took a shortcut to collapse initialization and references to such memoryless nodes with other nodes that did have memory associated with them at boot. This patch is based on fixes that allow the proper initialization and distinguishment of memoryless and memory-plus nodes after NUMA initialization. It extends the use of the 'node_to_mem_node()' API from 'topology.h' to modules that are allocating node-specific memory at boot, and allows such references to find available memory in another node. Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> --- block/blk-mq-cpumap.c | 3 ++- mm/page_alloc.c | 1 + 2 files changed, 3 insertions(+), 1 deletion(-)