From patchwork Fri Jan 5 11:05:20 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bharata B Rao
X-Patchwork-Id: 856014
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3zChlZ3PfWz9s4q for ; Fri, 5 Jan 2018 22:10:02 +1100 (AEDT)
Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3zChlY6m4jzF0QL for ; Fri, 5 Jan 2018 22:10:01 +1100 (AEDT)
X-Original-To: linuxppc-dev@lists.ozlabs.org
Delivered-To: linuxppc-dev@lists.ozlabs.org
Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.156.1; helo=mx0a-001b2d01.pphosted.com; envelope-from=bharata@linux.vnet.ibm.com; receiver=)
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3zChfb5sY3zF0PR for ; Fri, 5 Jan 2018 22:05:43 +1100 (AEDT)
Received: from pps.filterd (m0098399.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id w05B4vmk136575 for ; Fri, 5 Jan 2018 06:05:40 -0500
Received: from e06smtp10.uk.ibm.com (e06smtp10.uk.ibm.com [195.75.94.106]) by mx0a-001b2d01.pphosted.com with ESMTP id 2fa78u2jba-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Fri, 05 Jan 2018 06:05:40 -0500
Received: from localhost by e06smtp10.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 5 Jan 2018 11:05:38 -0000
Received: from b06cxnps4074.portsmouth.uk.ibm.com (9.149.109.196) by e06smtp10.uk.ibm.com (192.168.101.140) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Fri, 5 Jan 2018 11:05:36 -0000
Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w05B5Zax46465224; Fri, 5 Jan 2018 11:05:35 GMT
Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 28D77A404D; Fri, 5 Jan 2018 10:59:33 +0000 (GMT)
Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 8593FA4051; Fri, 5 Jan 2018 10:59:31 +0000 (GMT)
Received: from bharata.in.ibm.com (unknown [9.77.82.95]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Fri, 5 Jan 2018 10:59:31 +0000 (GMT)
From: Bharata B Rao
To: linuxppc-dev@lists.ozlabs.org
Subject: [RFC FIX v1 1/2] powerpc: Discover radix availability before scanning the memory nodes
Date: Fri, 5 Jan 2018 16:35:20 +0530
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1515150321-24894-1-git-send-email-bharata@linux.vnet.ibm.com>
References: <1515150321-24894-1-git-send-email-bharata@linux.vnet.ibm.com>
X-TM-AS-GCONF: 00
x-cbid: 18010511-0040-0000-0000-00000400EFF2
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 18010511-0041-0000-0000-000026043AE6
Message-Id: <1515150321-24894-2-git-send-email-bharata@linux.vnet.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2018-01-05_04:, , signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1709140000 definitions=main-1801050157
X-BeenThere: linuxppc-dev@lists.ozlabs.org
X-Mailman-Version: 2.1.24
Precedence: list
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: aneesh.kumar@linux.vnet.ibm.com, Bharata B Rao , nfont@linux.vnet.ibm.com, anton@samba.org, david@gibson.dropbear.id.au
Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org
Sender: "Linuxppc-dev"

Currently device tree nodes for memory are scanned before the radix
feature is discovered in mmu_early_init_devtree(). Move this routine
ahead of scanning memory nodes so that we know if the guest is radix
or not when scanning ibm,dynamic-reconfiguration-memory.

Signed-off-by: Bharata B Rao
---
 arch/powerpc/kernel/prom.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index b15bae2..079d893 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -722,6 +722,8 @@ void __init early_init_devtree(void *params)
 	 */
 	of_scan_flat_dt(early_init_dt_scan_chosen_ppc, boot_command_line);
 
+	mmu_early_init_devtree();
+
 	/* Scan memory nodes and rebuild MEMBLOCKs */
 	of_scan_flat_dt(early_init_dt_scan_root, NULL);
 	of_scan_flat_dt(early_init_dt_scan_memory_ppc, NULL);
@@ -783,8 +785,6 @@ void __init early_init_devtree(void *params)
 	spinning_secondaries = boot_cpu_count - 1;
 #endif
 
-	mmu_early_init_devtree();
-
 #ifdef CONFIG_PPC_POWERNV
 	/* Scan and build the list of machine check recoverable ranges */
 	of_scan_flat_dt(early_init_dt_scan_recoverable_ranges, NULL);

From patchwork Fri Jan 5 11:05:21 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bharata B Rao
X-Patchwork-Id: 856015
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384
 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3zChp43xjpz9s4q for ; Fri, 5 Jan 2018 22:12:12 +1100 (AEDT)
Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3zChp42h7jzF0QW for ; Fri, 5 Jan 2018 22:12:12 +1100 (AEDT)
X-Original-To: linuxppc-dev@lists.ozlabs.org
Delivered-To: linuxppc-dev@lists.ozlabs.org
Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.158.5; helo=mx0a-001b2d01.pphosted.com; envelope-from=bharata@linux.vnet.ibm.com; receiver=)
Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3zChff5CL5zF0PR for ; Fri, 5 Jan 2018 22:05:46 +1100 (AEDT)
Received: from pps.filterd (m0098413.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id w05B5En6025628 for ; Fri, 5 Jan 2018 06:05:44 -0500
Received: from e06smtp13.uk.ibm.com (e06smtp13.uk.ibm.com [195.75.94.109]) by mx0b-001b2d01.pphosted.com with ESMTP id 2fa49k9veu-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Fri, 05 Jan 2018 06:05:43 -0500
Received: from localhost by e06smtp13.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 5 Jan 2018 11:05:41 -0000
Received: from b06cxnps3074.portsmouth.uk.ibm.com (9.149.109.194) by e06smtp13.uk.ibm.com (192.168.101.143) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Fri, 5 Jan 2018 11:05:39 -0000
Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w05B5cQG57278544; Fri, 5 Jan 2018 11:05:38 GMT
Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 2AC1AA4059; Fri, 5 Jan 2018 10:59:36 +0000 (GMT)
Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 880B5A4040; Fri, 5 Jan 2018 10:59:34 +0000 (GMT)
Received: from bharata.in.ibm.com (unknown [9.77.82.95]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Fri, 5 Jan 2018 10:59:34 +0000 (GMT)
From: Bharata B Rao
To: linuxppc-dev@lists.ozlabs.org
Subject: [RFC FIX v1 2/2] powerpc: Fix memory unplug failure on radix guest
Date: Fri, 5 Jan 2018 16:35:21 +0530
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1515150321-24894-1-git-send-email-bharata@linux.vnet.ibm.com>
References: <1515150321-24894-1-git-send-email-bharata@linux.vnet.ibm.com>
X-TM-AS-GCONF: 00
x-cbid: 18010511-0012-0000-0000-0000059FE652
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 18010511-0013-0000-0000-0000191B37B4
Message-Id: <1515150321-24894-3-git-send-email-bharata@linux.vnet.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2018-01-05_04:, , signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1709140000 definitions=main-1801050157
X-BeenThere: linuxppc-dev@lists.ozlabs.org
X-Mailman-Version: 2.1.24
Precedence: list
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: aneesh.kumar@linux.vnet.ibm.com,
 Bharata B Rao , nfont@linux.vnet.ibm.com, anton@samba.org, david@gibson.dropbear.id.au
Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org
Sender: "Linuxppc-dev"

For a PowerKVM guest, it is possible to explicitly specify a DIMM
device in addition to the system RAM at boot time. When such a cold
plugged DIMM device is removed from a radix guest, we hit the following
warning in the guest kernel, resulting in the eventual failure of
memory unplug:

remove_pud_table: unaligned range
WARNING: CPU: 3 PID: 164 at arch/powerpc/mm/pgtable-radix.c:597 remove_pagetable+0x468/0xca0
Call Trace:
remove_pagetable+0x464/0xca0 (unreliable)
radix__remove_section_mapping+0x24/0x40
remove_section_mapping+0x28/0x60
arch_remove_memory+0xcc/0x120
remove_memory+0x1ac/0x270
dlpar_remove_lmb+0x1ac/0x210
dlpar_memory+0xbc4/0xeb0
pseries_hp_work_fn+0x1a4/0x230
process_one_work+0x1cc/0x660
worker_thread+0xac/0x6d0
kthread+0x16c/0x1b0
ret_from_kernel_thread+0x5c/0x74

The cold plugged DIMM memory gets merged into the same memblock region
as RAM and hence gets mapped at 1G alignment. However, since removal is
done one LMB (LMB size 256MB) at a time, the address of the LMB (which
is 256MB aligned) gets flagged as unaligned in remove_pud_table(),
resulting in the above failure. This problem is not seen for hot
plugged memory because its mappings are created separately for each LMB
and hence they all get aligned at 256MB.

To fix this problem for cold plugged memory, mark the cold plugged
memblock region explicitly as hotplugged so that the region doesn't get
merged with RAM. All the memory that is discovered via
ibm,dynamic-reconfiguration-memory is marked so (1). Next, identify
such regions in radix_init_pgtable() and create separate mappings
within that region for each LMB so that they don't get aligned at 1G
like the RAM region (2).
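
As an illustration only (this is a standalone sketch, not code from the
patch or from pgtable-radix.c), the alignment argument above can be
reproduced with plain integer arithmetic. The helper names
is_aligned() and range_removable() and the SZ_* macros are hypothetical
and exist only for this example; the assumption is that a range can be
torn down only when both of its ends are aligned to the size of the
mapping that backs it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_256M (256ULL << 20)	/* LMB size quoted in the commit message */
#define SZ_1G   (1ULL << 30)	/* mapping size used for the merged RAM region */

/* True when addr is a multiple of size; size must be a power of two. */
static bool is_aligned(uint64_t addr, uint64_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return (addr & (size - 1)) == 0;
}

/*
 * Hypothetical stand-in for the check described above: the range
 * [start, end) is removable only if both ends are aligned to the size
 * of the mapping that covers it.
 */
static bool range_removable(uint64_t start, uint64_t end, uint64_t map_size)
{
	return is_aligned(start, map_size) && is_aligned(end, map_size);
}
```

With these definitions, an LMB starting at 1G + 256MB is rejected when
the backing mapping is 1G wide, but accepted when the mapping was
created per LMB at 256MB.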

(1) The effect of marking the memory as hotplugged is that the marked
memory falls into ZONE_MOVABLE if the movable_node kernel command line
option is enabled. This means no kernel allocations can occur from this
memory. This should be reasonable to expect for hotplugged memory but
has an undesirable effect on PowerVM. On PowerVM, all the memory except
RMA is represented via ibm,dynamic-reconfiguration-memory and hence we
can't mark that entire memory as hotpluggable and movable. However,
since radix isn't supported on PowerVM, we make this marking
conditional on radix so that PowerVM isn't affected. For PowerKVM
guests, all boot time memory is represented via memory@XXXX nodes and
hot plugged/pluggable memory is represented via the
ibm,dynamic-reconfiguration-memory property. We are marking all the
memory that is in ASSIGNED state during boot as hotplugged. With this,
only cold plugged memory gets marked for PowerKVM.

(2) To create separate mappings for every LMB in the hot plugged
region, we need lmb-size. I am currently using the
memory_block_size_bytes() API to get the lmb-size. Since this is early
init time code, the machine type isn't probed yet and hence
memory_block_size_bytes() returns the default LMB size of 16MB. Hence
we end up creating separate mappings at a much finer granularity (16MB)
than what we could ideally use for the pseries machine (256MB).
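
To make the granularity point in (2) concrete, here is a small
userspace sketch (not kernel code, and not part of the patch) of
walking a region in fixed-size steps and requesting one mapping per
step. The function map_region_in_chunks() and the map_fn callback type
are hypothetical names for this example; the callback merely stands in
for create_physical_mapping(). Unlike the loop in the patch, the sketch
clamps the final chunk to the end of the region, which is an assumption
of the example rather than the patch's behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_16M  (16ULL << 20)	/* default block size mentioned in note (2) */
#define SZ_256M (256ULL << 20)	/* pseries LMB size from the commit message */
#define SZ_1G   (1ULL << 30)

/* Hypothetical callback standing in for create_physical_mapping(). */
typedef int (*map_fn)(uint64_t start, uint64_t end);

/*
 * Walk [base, base + size) in chunk-sized steps and request a separate
 * mapping for each step. Returns the number of mappings requested.
 * The last step is clamped to the region end (example-only choice).
 */
static unsigned long map_region_in_chunks(uint64_t base, uint64_t size,
					  uint64_t chunk, map_fn map)
{
	uint64_t end = base + size;
	unsigned long count = 0;

	assert(chunk != 0);
	for (uint64_t addr = base; addr < end; addr += chunk) {
		uint64_t stop = (end - addr < chunk) ? end : addr + chunk;

		if (map)
			map(addr, stop);
		count++;
	}
	return count;
}
```

For a 1G cold plugged region, a 256MB step requests 4 mappings while
the 16MB default requests 64, which is the extra cost that note (2)
refers to.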

Signed-off-by: Bharata B Rao
---
 arch/powerpc/kernel/prom.c      |  2 ++
 arch/powerpc/mm/pgtable-radix.c | 17 ++++++++++++++---
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 079d893..2ad8fb1 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -525,6 +525,8 @@ static int __init early_init_dt_scan_drconf_memory(unsigned long node)
 				size = 0x80000000ul - base;
 			}
 			memblock_add(base, size);
+			if (early_radix_enabled())
+				memblock_mark_hotplug(base, size);
 		} while (--rngs);
 	}
 	memblock_dump_all();
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index cfbbee9..10ceced 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -278,15 +279,25 @@ static void __init radix_init_pgtable(void)
 {
 	unsigned long rts_field;
 	struct memblock_region *reg;
+	phys_addr_t addr;
+	u64 lmb_size = memory_block_size_bytes();
 
 	/* We don't support slb for radix */
 	mmu_slb_size = 0;
 	/*
 	 * Create the linear mapping, using standard page size for now
 	 */
-	for_each_memblock(memory, reg)
-		WARN_ON(create_physical_mapping(reg->base,
-						reg->base + reg->size));
+	for_each_memblock(memory, reg) {
+		if (memblock_is_hotpluggable(reg)) {
+			for (addr = reg->base; addr < (reg->base + reg->size);
+				addr += lmb_size)
+				WARN_ON(create_physical_mapping(addr,
+					addr + lmb_size));
+		} else {
+			WARN_ON(create_physical_mapping(reg->base,
+						reg->base + reg->size));
+		}
+	}
 
 	/* Find out how many PID bits are supported */
 	if (cpu_has_feature(CPU_FTR_HVMODE)) {