From patchwork Wed Nov 1 18:58:19 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Bringmann X-Patchwork-Id: 833092 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3yRyNj1gHkz9t2V for ; Thu, 2 Nov 2017 06:05:57 +1100 (AEDT) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3yRyNh6sxnzDr5w for ; Thu, 2 Nov 2017 06:05:56 +1100 (AEDT) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.158.5; helo=mx0a-001b2d01.pphosted.com; envelope-from=mwb@linux.vnet.ibm.com; receiver=) Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3yRyD32Nd4zDr6C for ; Thu, 2 Nov 2017 05:58:26 +1100 (AEDT) Received: from pps.filterd (m0098416.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id vA1IusUr014528 for ; Wed, 1 Nov 2017 14:58:24 -0400 Received: from e17.ny.us.ibm.com (e17.ny.us.ibm.com [129.33.205.207]) by mx0b-001b2d01.pphosted.com with ESMTP id 2dyh36sq77-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Wed, 01 Nov 2017 14:58:24 -0400 Received: from localhost by e17.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 1 Nov 2017 14:58:23 -0400 Received: from b01cxnp23033.gho.pok.ibm.com (9.57.198.28) by e17.ny.us.ibm.com (146.89.104.204) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Wed, 1 Nov 2017 14:58:21 -0400 Received: from b01ledav005.gho.pok.ibm.com (b01ledav005.gho.pok.ibm.com [9.57.199.110]) by b01cxnp23033.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id vA1IwKcC1835500; Wed, 1 Nov 2017 18:58:20 GMT Received: from b01ledav005.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id D0DB1AE03B; Wed, 1 Nov 2017 14:59:06 -0400 (EDT) Received: from oc1554177480.ibm.com (unknown [9.53.92.194]) by b01ledav005.gho.pok.ibm.com (Postfix) with ESMTP id 33EE2AE04E; Wed, 1 Nov 2017 14:59:06 -0400 (EDT) To: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org From: Michael Bringmann Organization: IBM Linux Technology Center Subject: [PATCH V6 1/2] pseries/nodes: Ensure enough nodes avail for operations In-Reply-To: Date: Wed, 1 Nov 2017 13:58:19 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 Content-Language: en-US X-TM-AS-GCONF: 00 x-cbid: 17110118-0040-0000-0000-000003BBDBB8 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007993; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000239; SDB=6.00939708; UDB=6.00473792; IPR=6.00719969; BA=6.00005666; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017824; XFM=3.00000015; UTC=2017-11-01 18:58:22 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17110118-0041-0000-0000-000007B0EEF1 Message-Id: X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-11-01_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000 definitions=main-1711010247 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Thomas Falcon , John Allen , Michael Bringmann , Tyrel Datwyler , Nathan Fontenot Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" pseries/nodes: On pseries systems which allow 'hot-add' of CPU or memory resources, it may occur that the new resources are to be inserted into nodes that were not used for these resources at bootup. In the kernel, any node that is used must be defined and initialized. This patch ensures that sufficient nodes are defined to support configuration requirements after boot, as well as at boot. This patch extracts the value of the lowest domain level (number of allocable resources) from the device tree property "ibm,max-associativity-domains" to use as the maximum number of nodes to setup as possibly available in the system. This new setting will override the instruction, nodes_and(node_possible_map, node_possible_map, node_online_map); presently seen in the function arch/powerpc/mm/numa.c:initmem_init(). If the property is not present at boot, no operation will be performed to define or enable additional nodes. Signed-off-by: Michael Bringmann --- Changes in V6: -- Remove some node initialization/allocation from boot setup --- arch/powerpc/mm/numa.c | 40 +++++++++++++++++++++++++++++++++++++--- 1 file changed, 37 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index eb604b3..334a1ff 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -892,6 +892,37 @@ static void __init setup_node_data(int nid, u64 start_pfn, u64 end_pfn) NODE_DATA(nid)->node_spanned_pages = spanned_pages; } +static void __init find_possible_nodes(void) +{ + struct device_node *rtas; + u32 numnodes, i; + + if (min_common_depth <= 0) + return; + + rtas = of_find_node_by_path("/rtas"); + if (!rtas) + return; + + if (of_property_read_u32_index(rtas, + "ibm,max-associativity-domains", + min_common_depth, &numnodes)) + goto out; + + pr_info("numa: Nodes = %d (mcd = %d)\n", numnodes, + min_common_depth); + + for (i = 0; i < numnodes; i++) { + if (!node_possible(i)) { + setup_node_data(i, 0, 0); + node_set(i, node_possible_map); + } + } + +out: + of_node_put(rtas); +} + void __init initmem_init(void) { int nid, cpu; @@ -905,12 +936,15 @@ void __init initmem_init(void) memblock_dump_all(); /* - * Reduce the possible NUMA nodes to the online NUMA nodes, - * since we do not support node hotplug. This ensures that we - * lower the maximum NUMA node ID to what is actually present. + * Modify the set of possible NUMA nodes to reflect information + * available about the set of online nodes, and the set of nodes + * that we expect to make use of for this platform's affinity + * calculations. */ nodes_and(node_possible_map, node_possible_map, node_online_map); + find_possible_nodes(); + for_each_online_node(nid) { unsigned long start_pfn, end_pfn; From patchwork Wed Nov 1 18:58:23 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Bringmann X-Patchwork-Id: 833093 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3yRyR11ZqMz9sNw for ; Thu, 2 Nov 2017 06:07:57 +1100 (AEDT) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3yRyR10P4kzDrF7 for ; Thu, 2 Nov 2017 06:07:57 +1100 (AEDT) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.156.1; helo=mx0a-001b2d01.pphosted.com; envelope-from=mwb@linux.vnet.ibm.com; receiver=) Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3yRyD81swnzDr6C for ; Thu, 2 Nov 2017 05:58:32 +1100 (AEDT) Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id vA1Iutkd028799 for ; Wed, 1 Nov 2017 14:58:29 -0400 Received: from e17.ny.us.ibm.com (e17.ny.us.ibm.com [129.33.205.207]) by mx0a-001b2d01.pphosted.com with ESMTP id 2dyh3h1mmt-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Wed, 01 Nov 2017 14:58:28 -0400 Received: from localhost by e17.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 1 Nov 2017 14:58:27 -0400 Received: from b01cxnp22035.gho.pok.ibm.com (9.57.198.25) by e17.ny.us.ibm.com (146.89.104.204) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Wed, 1 Nov 2017 14:58:25 -0400 Received: from b01ledav005.gho.pok.ibm.com (b01ledav005.gho.pok.ibm.com [9.57.199.110]) by b01cxnp22035.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id vA1IwPDR50200588; Wed, 1 Nov 2017 18:58:25 GMT Received: from b01ledav005.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 34C2AAE034; Wed, 1 Nov 2017 14:59:11 -0400 (EDT) Received: from oc1554177480.ibm.com (unknown [9.53.92.194]) by b01ledav005.gho.pok.ibm.com (Postfix) with ESMTP id 82ACEAE04E; Wed, 1 Nov 2017 14:59:10 -0400 (EDT) To: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org From: Michael Bringmann Organization: IBM Linux Technology Center Subject: [PATCH V6 2/2] pseries/initnodes: Ensure nodes initialized for hotplug In-Reply-To: Date: Wed, 1 Nov 2017 13:58:23 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 Content-Language: en-US X-TM-AS-GCONF: 00 x-cbid: 17110118-0040-0000-0000-000003BBDBBA X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007993; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000239; SDB=6.00939708; UDB=6.00473792; IPR=6.00719969; BA=6.00005666; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017824; XFM=3.00000015; UTC=2017-11-01 18:58:27 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17110118-0041-0000-0000-000007B0EEF3 Message-Id: <129c79ac-3fc6-95c4-1eef-43f862c510c3@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-11-01_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 lowpriorityscore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000 definitions=main-1711010247 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Thomas Falcon , John Allen , Michael Bringmann , Tyrel Datwyler , Nathan Fontenot Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" pseries/nodes: On pseries systems which allow 'hot-add' of CPU, it may occur that the new resources are to be inserted into nodes that were not used for memory resources at bootup. Many different configurations of PowerPC resources may need to be supported depending upon the environment. This patch fixes some problems encountered at runtime with configurations that support memory-less nodes, or that hot-add CPUs into nodes that are memoryless during system execution after boot. Signed-off-by: Michael Bringmann --- Changes in V6: -- Add some needed node initialization to runtime code that maps CPUs based on VPHN associativity -- Add error checks and alternate recovery for compile flag CONFIG_MEMORY_HOTPLUG -- Add alternate node selection recovery for !CONFIG_MEMORY_HOTPLUG --- arch/powerpc/mm/numa.c | 51 ++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 40 insertions(+), 11 deletions(-) diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index 334a1ff..163f4cc 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -551,7 +551,7 @@ static int numa_setup_cpu(unsigned long lcpu) nid = of_node_to_nid_single(cpu); out_present: - if (nid < 0 || !node_online(nid)) + if (nid < 0 || !node_possible(nid)) nid = first_online_node; map_cpu_to_node(lcpu, nid); @@ -867,7 +867,7 @@ void __init dump_numa_cpu_topology(void) } /* Initialize NODE_DATA for a node on the local memory */ -static void __init setup_node_data(int nid, u64 start_pfn, u64 end_pfn) +static void setup_node_data(int nid, u64 start_pfn, u64 end_pfn) { u64 spanned_pages = end_pfn - start_pfn; const size_t nd_size = roundup(sizeof(pg_data_t), SMP_CACHE_BYTES); @@ -913,10 +913,8 @@ static void __init find_possible_nodes(void) min_common_depth); for (i = 0; i < numnodes; i++) { - if (!node_possible(i)) { - setup_node_data(i, 0, 0); + if (!node_possible(i)) node_set(i, node_possible_map); - } } out: @@ -1312,6 +1310,42 @@ static long vphn_get_associativity(unsigned long cpu, return rc; } +static inline int find_cpu_nid(int cpu) +{ + __be32 associativity[VPHN_ASSOC_BUFSIZE] = {0}; + int new_nid; + + /* Use associativity from first thread for all siblings */ + vphn_get_associativity(cpu, associativity); + new_nid = associativity_to_nid(associativity); + if (new_nid < 0 || !node_possible(new_nid)) + new_nid = first_online_node; + + if (NODE_DATA(new_nid) == NULL) { +#ifdef CONFIG_MEMORY_HOTPLUG + /* + * Need to ensure that NODE_DATA is initialized + * for a node from available memory (see + * memblock_alloc_try_nid). If unable to init + * the node, then default to nearest node that + * has memory installed. + */ + if (try_online_node(new_nid)) + new_nid = first_online_node; +#else + /* + * Default to using the nearest node that has + * memory installed. Otherwise, it would be + * necessary to patch the kernel MM code to deal + * with more memoryless-node error conditions. + */ + new_nid = first_online_node; +#endif + } + + return new_nid; +} + /* * Update the CPU maps and sysfs entries for a single CPU when its NUMA * characteristics change. This function doesn't perform any locking and is @@ -1379,7 +1413,6 @@ int numa_update_cpu_topology(bool cpus_locked) { unsigned int cpu, sibling, changed = 0; struct topology_update_data *updates, *ud; - __be32 associativity[VPHN_ASSOC_BUFSIZE] = {0}; cpumask_t updated_cpus; struct device *dev; int weight, new_nid, i = 0; @@ -1417,11 +1450,7 @@ int numa_update_cpu_topology(bool cpus_locked) continue; } - /* Use associativity from first thread for all siblings */ - vphn_get_associativity(cpu, associativity); - new_nid = associativity_to_nid(associativity); - if (new_nid < 0 || !node_online(new_nid)) - new_nid = first_online_node; + new_nid = find_cpu_nid(cpu); if (new_nid == numa_cpu_lookup_table[cpu]) { cpumask_andnot(&cpu_associativity_changes_mask,