From patchwork Wed Sep 20 20:17:03 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Tatashin X-Patchwork-Id: 816464 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xyB7954w8z9ryv for ; Thu, 21 Sep 2017 06:24:53 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xyB793WmYzDr5W for ; Thu, 21 Sep 2017 06:24:53 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=oracle.com (client-ip=141.146.126.69; helo=aserp1040.oracle.com; envelope-from=pasha.tatashin@oracle.com; receiver=) Received: from aserp1040.oracle.com (aserp1040.oracle.com [141.146.126.69]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xy9zq4sCdzDqSL for ; Thu, 21 Sep 2017 06:18:31 +1000 (AEST) Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v8KKHNKO011313 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:23 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHNYG020989 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:23 GMT Received: from abhmp0008.oracle.com (abhmp0008.oracle.com [141.146.116.14]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id v8KKHLfV007817; Wed, 20 Sep 2017 20:17:21 GMT Received: from xakep.us.oracle.com (/10.154.127.176) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 Sep 2017 13:17:20 -0700 From: Pavel Tatashin To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com, heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org, mhocko@kernel.org, ard.biesheuvel@linaro.org, mark.rutland@arm.com, will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org, mgorman@techsingularity.net, steven.sistare@oracle.com, daniel.m.jordan@oracle.com, bob.picco@oracle.com Subject: [PATCH v9 01/12] x86/mm: setting fields in deferred pages Date: Wed, 20 Sep 2017 16:17:03 -0400 Message-Id: <20170920201714.19817-2-pasha.tatashin@oracle.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20170920201714.19817-1-pasha.tatashin@oracle.com> References: <20170920201714.19817-1-pasha.tatashin@oracle.com> X-Source-IP: aserv0021.oracle.com [141.146.126.233] X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT), flags and other fields in "struct page"es are never changed prior to 
first initializing struct pages by going through __init_single_page(). With the deferred struct page feature enabled, however, we set fields in register_page_bootmem_info() that are subsequently clobbered right afterwards in free_all_bootmem(): mem_init() { register_page_bootmem_info(); free_all_bootmem(); ... } When register_page_bootmem_info() is called, only non-deferred struct pages are initialized. But this function also goes through some reserved pages which might be part of the deferred range, and thus are not yet initialized. mem_init register_page_bootmem_info register_page_bootmem_info_node get_page_bootmem .. setting fields here .. such as: page->freelist = (void *)type; free_all_bootmem() free_low_memory_core_early() for_each_reserved_mem_region() reserve_bootmem_region() init_reserved_page() <- Only if this is a deferred reserved page __init_single_pfn() __init_single_page() memset(0) <-- Lose the set fields here We end up with an issue where we currently do not observe a problem, as the memory is explicitly zeroed. But if flag asserts are changed, we can start hitting issues. Also, because in this patch series we will stop zeroing struct page memory during allocation, we must make sure that struct pages are properly initialized prior to using them. The deferred-reserved pages are initialized in free_all_bootmem(). Therefore, the fix is to swap the order of the above calls. Signed-off-by: Pavel Tatashin Reviewed-by: Steven Sistare Reviewed-by: Daniel Jordan Reviewed-by: Bob Picco Acked-by: Michal Hocko --- arch/x86/mm/init_64.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index 5ea1c3c2636e..30fe22558720 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1182,12 +1182,17 @@ void __init mem_init(void) /* clear_bss() already clear the empty_zero_page */ - register_page_bootmem_info(); - /* this will put all memory onto the freelists */ free_all_bootmem(); after_bootmem = 1; + /* Must be done after boot memory is put on freelist, because here we + * might set fields in deferred struct pages that have not yet been + * initialized, and free_all_bootmem() initializes all the reserved + * deferred pages for us.
+ */ + register_page_bootmem_info(); + /* Register memory areas for /proc/kcore */ kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR, PAGE_SIZE, KCORE_OTHER); From patchwork Wed Sep 20 20:17:04 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Tatashin X-Patchwork-Id: 816465 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xyB8x0pyLz9ryv for ; Thu, 21 Sep 2017 06:26:25 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xyB8w6gGpzDrJw for ; Thu, 21 Sep 2017 06:26:24 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=oracle.com (client-ip=156.151.31.81; helo=userp1040.oracle.com; envelope-from=pasha.tatashin@oracle.com; receiver=) Received: from userp1040.oracle.com (userp1040.oracle.com [156.151.31.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xy9zt2rbDzDqYN for ; Thu, 21 Sep 2017 06:18:34 +1000 (AEST) Received: from userv0022.oracle.com (userv0022.oracle.com [156.151.31.74]) by userp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v8KKHO8J007673 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:24 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by userv0022.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHNRL029802 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:24 GMT Received: from abhmp0008.oracle.com (abhmp0008.oracle.com [141.146.116.14]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHMfa028038; Wed, 20 Sep 2017 20:17:22 GMT Received: from xakep.us.oracle.com (/10.154.127.176) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 Sep 2017 13:17:22 -0700 From: Pavel Tatashin To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com, heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org, mhocko@kernel.org, ard.biesheuvel@linaro.org, mark.rutland@arm.com, will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org, mgorman@techsingularity.net, steven.sistare@oracle.com, daniel.m.jordan@oracle.com, bob.picco@oracle.com Subject: [PATCH v9 02/12] sparc64/mm: setting fields in deferred pages Date: Wed, 20 Sep 2017 16:17:04 -0400 Message-Id: <20170920201714.19817-3-pasha.tatashin@oracle.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20170920201714.19817-1-pasha.tatashin@oracle.com> References: <20170920201714.19817-1-pasha.tatashin@oracle.com> X-Source-IP: userv0022.oracle.com [156.151.31.74] X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: 
"Linuxppc-dev" Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT), flags and other fields in "struct page"es are never changed prior to first initializing struct pages by going through __init_single_page(). With deferred struct page feature enabled there is a case where we set some fields prior to initializing: mem_init() { register_page_bootmem_info(); free_all_bootmem(); ... } When register_page_bootmem_info() is called only non-deferred struct pages are initialized. But, this function goes through some reserved pages which might be part of the deferred, and thus are not yet initialized. mem_init register_page_bootmem_info register_page_bootmem_info_node get_page_bootmem .. setting fields here .. such as: page->freelist = (void *)type; free_all_bootmem() free_low_memory_core_early() for_each_reserved_mem_region() reserve_bootmem_region() init_reserved_page() <- Only if this is deferred reserved page __init_single_pfn() __init_single_page() memset(0) <-- Loose the set fields here We end-up with similar issue as in the previous patch, where currently we do not observe problem as memory is zeroed. But, if flag asserts are changed we can start hitting issues. Also, because in this patch series we will stop zeroing struct page memory during allocation, we must make sure that struct pages are properly initialized prior to using them. The deferred-reserved pages are initialized in free_all_bootmem(). Therefore, the fix is to switch the above calls. Signed-off-by: Pavel Tatashin Reviewed-by: Steven Sistare Reviewed-by: Daniel Jordan Reviewed-by: Bob Picco Acked-by: David S. Miller Acked-by: Michal Hocko --- arch/sparc/mm/init_64.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c index 6034569e2c0d..310c6754bcaa 100644 --- a/arch/sparc/mm/init_64.c +++ b/arch/sparc/mm/init_64.c @@ -2548,9 +2548,15 @@ void __init mem_init(void) { high_memory = __va(last_valid_pfn << PAGE_SHIFT); - register_page_bootmem_info(); free_all_bootmem(); + /* Must be done after boot memory is put on freelist, because here we + * might set fields in deferred struct pages that have not yet been + * initialized, and free_all_bootmem() initializes all the reserved + * deferred pages for us. + */ + register_page_bootmem_info(); + /* * Set up the zero page, mark it reserved, so that page count * is not manipulated when freeing the page from user ptes. 
From patchwork Wed Sep 20 20:17:05 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Tatashin X-Patchwork-Id: 816470 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xyBHW43ghz9s7v for ; Thu, 21 Sep 2017 06:32:07 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xyBHW35VtzDrSS for ; Thu, 21 Sep 2017 06:32:07 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=oracle.com (client-ip=156.151.31.81; helo=userp1040.oracle.com; envelope-from=pasha.tatashin@oracle.com; receiver=) Received: from userp1040.oracle.com (userp1040.oracle.com [156.151.31.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xyB002DY7zDrCb for ; Thu, 21 Sep 2017 06:18:40 +1000 (AEST) Received: from userv0022.oracle.com (userv0022.oracle.com [156.151.31.74]) by userp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v8KKHPEi007690 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:26 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by userv0022.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHPNa029855 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:25 GMT Received: from abhmp0008.oracle.com (abhmp0008.oracle.com [141.146.116.14]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHOf7028047; Wed, 20 Sep 2017 20:17:24 GMT Received: from xakep.us.oracle.com (/10.154.127.176) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 Sep 2017 13:17:24 -0700 From: Pavel Tatashin To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com, heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org, mhocko@kernel.org, ard.biesheuvel@linaro.org, mark.rutland@arm.com, will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org, mgorman@techsingularity.net, steven.sistare@oracle.com, daniel.m.jordan@oracle.com, bob.picco@oracle.com Subject: [PATCH v9 03/12] mm: deferred_init_memmap improvements Date: Wed, 20 Sep 2017 16:17:05 -0400 Message-Id: <20170920201714.19817-4-pasha.tatashin@oracle.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20170920201714.19817-1-pasha.tatashin@oracle.com> References: <20170920201714.19817-1-pasha.tatashin@oracle.com> X-Source-IP: userv0022.oracle.com [156.151.31.74] X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" This patch fixes two issues in deferred_init_memmap ===== In deferred_init_memmap() where all deferred struct pages are initialized we have a check like this: 
if (page->flags) { VM_BUG_ON(page_zone(page) != zone); goto free_range; } This way we are checking if the current deferred page has already been initialized. It works because memory for struct pages has been zeroed, and the only way flags can be non-zero is if the page already went through __init_single_page(). But once we change the current behavior and no longer zero the memory in the memblock allocator, we cannot trust anything inside "struct page"s until they are initialized. This patch fixes that: deferred_init_memmap() is rewritten to loop only through the free memory ranges provided by memblock. ===== This patch also fixes another existing issue on systems that have holes in zones, i.e. where CONFIG_HOLES_IN_ZONE is defined. In for_each_mem_pfn_range() we have code like this: if (!pfn_valid_within(pfn)) goto free_range; Note: 'page' is not set to NULL and is not incremented, but 'pfn' advances. This means that if deferred struct pages are enabled on systems with this kind of hole, Linux would get memory corruption. I have fixed this issue by defining a new macro that performs all the necessary operations when we free the current set of pages. Signed-off-by: Pavel Tatashin Reviewed-by: Steven Sistare Reviewed-by: Daniel Jordan Reviewed-by: Bob Picco --- mm/page_alloc.c | 161 +++++++++++++++++++++++++++----------------------- 1 file changed, 78 insertions(+), 83 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index c841af88836a..d132c801d2c1 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1410,14 +1410,17 @@ void clear_zone_contiguous(struct zone *zone) } #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT -static void __init deferred_free_range(struct page *page, - unsigned long pfn, int nr_pages) +static void __init deferred_free_range(unsigned long pfn, + unsigned long nr_pages) { - int i; + struct page *page; + unsigned long i; - if (!page) + if (!nr_pages) return; + page = pfn_to_page(pfn); + /* Free a large naturally-aligned chunk if possible */ if (nr_pages == pageblock_nr_pages && (pfn & (pageblock_nr_pages - 1)) == 0) { @@ -1443,19 +1446,82 @@ static inline void __init pgdat_init_report_one_done(void) complete(&pgdat_init_all_done_comp); } +#define DEFERRED_FREE(nr_free, free_base_pfn, page) \ +({ \ + unsigned long nr = (nr_free); \ + \ + deferred_free_range((free_base_pfn), (nr)); \ + (free_base_pfn) = 0; \ + (nr_free) = 0; \ + page = NULL; \ + nr; \ +}) + +static unsigned long deferred_init_range(int nid, int zid, unsigned long pfn, + unsigned long end_pfn) +{ + struct mminit_pfnnid_cache nid_init_state = { }; + unsigned long nr_pgmask = pageblock_nr_pages - 1; + unsigned long free_base_pfn = 0; + unsigned long nr_pages = 0; + unsigned long nr_free = 0; + struct page *page = NULL; + + for (; pfn < end_pfn; pfn++) { + /* + * First we check if pfn is valid on architectures where it is + * possible to have holes within pageblock_nr_pages. On systems + * where it is not possible, this function is optimized out. + * + * Then, we check if a current large page is valid by only + * checking the validity of the head pfn. + * + * meminit_pfn_in_nid is checked on systems where pfns can + * interleave within a node: a pfn is between start and end + * of a node, but does not belong to this memory node. + * + * Finally, we minimize pfn page lookups and scheduler checks by + * performing it only once every pageblock_nr_pages.
+ */ + if (!pfn_valid_within(pfn)) { + nr_pages += DEFERRED_FREE(nr_free, free_base_pfn, page); + } else if (!(pfn & nr_pgmask) && !pfn_valid(pfn)) { + nr_pages += DEFERRED_FREE(nr_free, free_base_pfn, page); + } else if (!meminit_pfn_in_nid(pfn, nid, &nid_init_state)) { + nr_pages += DEFERRED_FREE(nr_free, free_base_pfn, page); + } else if (page && (pfn & nr_pgmask)) { + page++; + __init_single_page(page, pfn, zid, nid); + nr_free++; + } else { + nr_pages += DEFERRED_FREE(nr_free, free_base_pfn, page); + page = pfn_to_page(pfn); + __init_single_page(page, pfn, zid, nid); + free_base_pfn = pfn; + nr_free = 1; + cond_resched(); + } + } + /* Free the last block of pages to allocator */ + nr_pages += DEFERRED_FREE(nr_free, free_base_pfn, page); + + return nr_pages; +} + /* Initialise remaining memory on a node */ static int __init deferred_init_memmap(void *data) { pg_data_t *pgdat = data; int nid = pgdat->node_id; - struct mminit_pfnnid_cache nid_init_state = { }; unsigned long start = jiffies; unsigned long nr_pages = 0; - unsigned long walk_start, walk_end; - int i, zid; + unsigned long spfn, epfn; + phys_addr_t spa, epa; + int zid; struct zone *zone; unsigned long first_init_pfn = pgdat->first_deferred_pfn; const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id); + u64 i; if (first_init_pfn == ULONG_MAX) { pgdat_init_report_one_done(); @@ -1477,83 +1543,12 @@ static int __init deferred_init_memmap(void *data) if (first_init_pfn < zone_end_pfn(zone)) break; } + first_init_pfn = max(zone->zone_start_pfn, first_init_pfn); - for_each_mem_pfn_range(i, nid, &walk_start, &walk_end, NULL) { - unsigned long pfn, end_pfn; - struct page *page = NULL; - struct page *free_base_page = NULL; - unsigned long free_base_pfn = 0; - int nr_to_free = 0; - - end_pfn = min(walk_end, zone_end_pfn(zone)); - pfn = first_init_pfn; - if (pfn < walk_start) - pfn = walk_start; - if (pfn < zone->zone_start_pfn) - pfn = zone->zone_start_pfn; - - for (; pfn < end_pfn; pfn++) { - if (!pfn_valid_within(pfn)) - goto free_range; - - /* - * Ensure pfn_valid is checked every - * pageblock_nr_pages for memory holes - */ - if ((pfn & (pageblock_nr_pages - 1)) == 0) { - if (!pfn_valid(pfn)) { - page = NULL; - goto free_range; - } - } - - if (!meminit_pfn_in_nid(pfn, nid, &nid_init_state)) { - page = NULL; - goto free_range; - } - - /* Minimise pfn page lookups and scheduler checks */ - if (page && (pfn & (pageblock_nr_pages - 1)) != 0) { - page++; - } else { - nr_pages += nr_to_free; - deferred_free_range(free_base_page, - free_base_pfn, nr_to_free); - free_base_page = NULL; - free_base_pfn = nr_to_free = 0; - - page = pfn_to_page(pfn); - cond_resched(); - } - - if (page->flags) { - VM_BUG_ON(page_zone(page) != zone); - goto free_range; - } - - __init_single_page(page, pfn, zid, nid); - if (!free_base_page) { - free_base_page = page; - free_base_pfn = pfn; - nr_to_free = 0; - } - nr_to_free++; - - /* Where possible, batch up pages for a single free */ - continue; -free_range: - /* Free the current block of pages to allocator */ - nr_pages += nr_to_free; - deferred_free_range(free_base_page, free_base_pfn, - nr_to_free); - free_base_page = NULL; - free_base_pfn = nr_to_free = 0; - } - /* Free the last block of pages to allocator */ - nr_pages += nr_to_free; - deferred_free_range(free_base_page, free_base_pfn, nr_to_free); - - first_init_pfn = max(end_pfn, first_init_pfn); + for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) { + spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa)); + epfn = min_t(unsigned 
long, zone_end_pfn(zone), PFN_DOWN(epa)); + nr_pages += deferred_init_range(nid, zid, spfn, epfn); } /* Sanity check that the next zone really is unpopulated */ From patchwork Wed Sep 20 20:17:06 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Tatashin X-Patchwork-Id: 816453 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xyB2N129Vz9sPm for ; Thu, 21 Sep 2017 06:20:44 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xyB2L0mShzDqSL for ; Thu, 21 Sep 2017 06:20:42 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=oracle.com (client-ip=141.146.126.69; helo=aserp1040.oracle.com; envelope-from=pasha.tatashin@oracle.com; receiver=) Received: from aserp1040.oracle.com (aserp1040.oracle.com [141.146.126.69]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xy9zp2bzvzDqSL for ; Thu, 21 Sep 2017 06:18:29 +1000 (AEST) Received: from aserv0022.oracle.com (aserv0022.oracle.com [141.146.126.234]) by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v8KKHR1O011377 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:28 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by aserv0022.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHRi9021156 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:27 GMT Received: from abhmp0008.oracle.com (abhmp0008.oracle.com [141.146.116.14]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHQ2q011250; Wed, 20 Sep 2017 20:17:26 GMT Received: from xakep.us.oracle.com (/10.154.127.176) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 Sep 2017 13:17:25 -0700 From: Pavel Tatashin To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com, heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org, mhocko@kernel.org, ard.biesheuvel@linaro.org, mark.rutland@arm.com, will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org, mgorman@techsingularity.net, steven.sistare@oracle.com, daniel.m.jordan@oracle.com, bob.picco@oracle.com Subject: [PATCH v9 04/12] sparc64: simplify vmemmap_populate Date: Wed, 20 Sep 2017 16:17:06 -0400 Message-Id: <20170920201714.19817-5-pasha.tatashin@oracle.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20170920201714.19817-1-pasha.tatashin@oracle.com> References: <20170920201714.19817-1-pasha.tatashin@oracle.com> X-Source-IP: aserv0022.oracle.com [141.146.126.234] X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: 
"Linuxppc-dev" Remove duplicating code by using common functions vmemmap_pud_populate and vmemmap_pgd_populate. Signed-off-by: Pavel Tatashin Reviewed-by: Steven Sistare Reviewed-by: Daniel Jordan Reviewed-by: Bob Picco Acked-by: David S. Miller Acked-by: Michal Hocko --- arch/sparc/mm/init_64.c | 23 ++++++----------------- 1 file changed, 6 insertions(+), 17 deletions(-) diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c index 310c6754bcaa..99aea4d15a5f 100644 --- a/arch/sparc/mm/init_64.c +++ b/arch/sparc/mm/init_64.c @@ -2651,30 +2651,19 @@ int __meminit vmemmap_populate(unsigned long vstart, unsigned long vend, vstart = vstart & PMD_MASK; vend = ALIGN(vend, PMD_SIZE); for (; vstart < vend; vstart += PMD_SIZE) { - pgd_t *pgd = pgd_offset_k(vstart); + pgd_t *pgd = vmemmap_pgd_populate(vstart, node); unsigned long pte; pud_t *pud; pmd_t *pmd; - if (pgd_none(*pgd)) { - pud_t *new = vmemmap_alloc_block(PAGE_SIZE, node); + if (!pgd) + return -ENOMEM; - if (!new) - return -ENOMEM; - pgd_populate(&init_mm, pgd, new); - } - - pud = pud_offset(pgd, vstart); - if (pud_none(*pud)) { - pmd_t *new = vmemmap_alloc_block(PAGE_SIZE, node); - - if (!new) - return -ENOMEM; - pud_populate(&init_mm, pud, new); - } + pud = vmemmap_pud_populate(pgd, vstart, node); + if (!pud) + return -ENOMEM; pmd = pmd_offset(pud, vstart); - pte = pmd_val(*pmd); if (!(pte & _PAGE_VALID)) { void *block = vmemmap_alloc_block(PMD_SIZE, node); From patchwork Wed Sep 20 20:17:07 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Tatashin X-Patchwork-Id: 816473 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xyBLn4PPZz9s7v for ; Thu, 21 Sep 2017 06:34:57 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xyBLn3PJmzDrW0 for ; Thu, 21 Sep 2017 06:34:57 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=oracle.com (client-ip=156.151.31.81; helo=userp1040.oracle.com; envelope-from=pasha.tatashin@oracle.com; receiver=) Received: from userp1040.oracle.com (userp1040.oracle.com [156.151.31.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xyB030P9vzDrCX for ; Thu, 21 Sep 2017 06:18:42 +1000 (AEST) Received: from userv0022.oracle.com (userv0022.oracle.com [156.151.31.74]) by userp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v8KKHSxD007777 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:29 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by userv0022.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHSlH029988 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:28 GMT Received: from abhmp0008.oracle.com (abhmp0008.oracle.com [141.146.116.14]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id v8KKHRrs007854; Wed, 20 Sep 2017 20:17:27 GMT Received: from xakep.us.oracle.com (/10.154.127.176) by default (Oracle Beehive Gateway v4.0) with ESMTP ; 
Wed, 20 Sep 2017 13:17:27 -0700 From: Pavel Tatashin To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com, heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org, mhocko@kernel.org, ard.biesheuvel@linaro.org, mark.rutland@arm.com, will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org, mgorman@techsingularity.net, steven.sistare@oracle.com, daniel.m.jordan@oracle.com, bob.picco@oracle.com Subject: [PATCH v9 05/12] mm: defining memblock_virt_alloc_try_nid_raw Date: Wed, 20 Sep 2017 16:17:07 -0400 Message-Id: <20170920201714.19817-6-pasha.tatashin@oracle.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20170920201714.19817-1-pasha.tatashin@oracle.com> References: <20170920201714.19817-1-pasha.tatashin@oracle.com> X-Source-IP: userv0022.oracle.com [156.151.31.74] X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" * A new variant of memblock_virt_alloc_* allocations: memblock_virt_alloc_try_nid_raw() - Does not zero the allocated memory - Does not panic if the request cannot be satisfied * Optimize early system hash allocations. Clients can call alloc_large_system_hash() with the HASH_ZERO flag to specify that the memory allocated for the system hash needs to be zeroed; otherwise the memory does not need to be zeroed, and the client will initialize it. If the memory does not need to be zeroed, the new memblock_virt_alloc_raw() interface is called instead, which improves boot performance. * Debug for the raw allocator. When CONFIG_DEBUG_VM is enabled, this patch sets all the memory that is returned by memblock_virt_alloc_try_nid_raw() to ones, to ensure that no callers expect zeroed memory.
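As a rough user-space illustration of the allocation policy described above (not the actual memblock implementation: TOY_HASH_ZERO, toy_early_alloc() and DEBUG_POISON are hypothetical stand-ins for HASH_ZERO, the memblock_virt_alloc_*() helpers and CONFIG_DEBUG_VM, and malloc() stands in for the boot allocator), the idea is to pay for zeroing only when the caller asks for it, and to poison the block in debug builds so callers that wrongly assume zeroed memory are caught early:

	#include <stdlib.h>
	#include <string.h>

	#define TOY_HASH_ZERO 0x1	/* hypothetical stand-in for HASH_ZERO */

	/* Models the memblock_virt_alloc_nopanic()/memblock_virt_alloc_raw()
	 * split: zero the block only when the caller requests it. */
	static void *toy_early_alloc(size_t size, unsigned int flags)
	{
		void *p = malloc(size);		/* stand-in for the memblock allocator */

		if (!p)
			return NULL;		/* "nopanic": report failure instead of aborting */
		if (flags & TOY_HASH_ZERO)
			memset(p, 0, size);	/* caller wants zeroed memory */
	#ifdef DEBUG_POISON
		else
			memset(p, 0xff, size);	/* models the CONFIG_DEBUG_VM fill with ones */
	#endif
		return p;
	}

	int main(void)
	{
		void *zeroed = toy_early_alloc(4096, TOY_HASH_ZERO);
		void *raw = toy_early_alloc(4096, 0);	/* caller will initialize every byte */

		free(zeroed);
		free(raw);
		return 0;
	}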
Signed-off-by: Pavel Tatashin Reviewed-by: Steven Sistare Reviewed-by: Daniel Jordan Reviewed-by: Bob Picco Acked-by: Michal Hocko --- include/linux/bootmem.h | 27 ++++++++++++++++++++++ mm/memblock.c | 60 +++++++++++++++++++++++++++++++++++++++++++------ mm/page_alloc.c | 15 ++++++------- 3 files changed, 87 insertions(+), 15 deletions(-) diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h index e223d91b6439..ea30b3987282 100644 --- a/include/linux/bootmem.h +++ b/include/linux/bootmem.h @@ -160,6 +160,9 @@ extern void *__alloc_bootmem_low_node(pg_data_t *pgdat, #define BOOTMEM_ALLOC_ANYWHERE (~(phys_addr_t)0) /* FIXME: Move to memblock.h at a point where we remove nobootmem.c */ +void *memblock_virt_alloc_try_nid_raw(phys_addr_t size, phys_addr_t align, + phys_addr_t min_addr, + phys_addr_t max_addr, int nid); void *memblock_virt_alloc_try_nid_nopanic(phys_addr_t size, phys_addr_t align, phys_addr_t min_addr, phys_addr_t max_addr, int nid); @@ -176,6 +179,14 @@ static inline void * __init memblock_virt_alloc( NUMA_NO_NODE); } +static inline void * __init memblock_virt_alloc_raw( + phys_addr_t size, phys_addr_t align) +{ + return memblock_virt_alloc_try_nid_raw(size, align, BOOTMEM_LOW_LIMIT, + BOOTMEM_ALLOC_ACCESSIBLE, + NUMA_NO_NODE); +} + static inline void * __init memblock_virt_alloc_nopanic( phys_addr_t size, phys_addr_t align) { @@ -257,6 +268,14 @@ static inline void * __init memblock_virt_alloc( return __alloc_bootmem(size, align, BOOTMEM_LOW_LIMIT); } +static inline void * __init memblock_virt_alloc_raw( + phys_addr_t size, phys_addr_t align) +{ + if (!align) + align = SMP_CACHE_BYTES; + return __alloc_bootmem_nopanic(size, align, BOOTMEM_LOW_LIMIT); +} + static inline void * __init memblock_virt_alloc_nopanic( phys_addr_t size, phys_addr_t align) { @@ -309,6 +328,14 @@ static inline void * __init memblock_virt_alloc_try_nid(phys_addr_t size, min_addr); } +static inline void * __init memblock_virt_alloc_try_nid_raw( + phys_addr_t size, phys_addr_t align, + phys_addr_t min_addr, phys_addr_t max_addr, int nid) +{ + return ___alloc_bootmem_node_nopanic(NODE_DATA(nid), size, align, + min_addr, max_addr); +} + static inline void * __init memblock_virt_alloc_try_nid_nopanic( phys_addr_t size, phys_addr_t align, phys_addr_t min_addr, phys_addr_t max_addr, int nid) diff --git a/mm/memblock.c b/mm/memblock.c index 91205780e6b1..1f299fb1eb08 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1327,7 +1327,6 @@ static void * __init memblock_virt_alloc_internal( return NULL; done: ptr = phys_to_virt(alloc); - memset(ptr, 0, size); /* * The min_count is set to 0 so that bootmem allocated blocks @@ -1340,6 +1339,45 @@ static void * __init memblock_virt_alloc_internal( return ptr; } +/** + * memblock_virt_alloc_try_nid_raw - allocate boot memory block without zeroing + * memory and without panicking + * @size: size of memory block to be allocated in bytes + * @align: alignment of the region and block's size + * @min_addr: the lower bound of the memory region from where the allocation + * is preferred (phys address) + * @max_addr: the upper bound of the memory region from where the allocation + * is preferred (phys address), or %BOOTMEM_ALLOC_ACCESSIBLE to + * allocate only from memory limited by memblock.current_limit value + * @nid: nid of the free area to find, %NUMA_NO_NODE for any node + * + * Public function, provides additional debug information (including caller + * info), if enabled. Does not zero allocated memory, does not panic if request + * cannot be satisfied. 
+ * + * RETURNS: + * Virtual address of allocated memory block on success, NULL on failure. + */ +void * __init memblock_virt_alloc_try_nid_raw( + phys_addr_t size, phys_addr_t align, + phys_addr_t min_addr, phys_addr_t max_addr, + int nid) +{ + void *ptr; + + memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=0x%llx max_addr=0x%llx %pF\n", + __func__, (u64)size, (u64)align, nid, (u64)min_addr, + (u64)max_addr, (void *)_RET_IP_); + + ptr = memblock_virt_alloc_internal(size, align, + min_addr, max_addr, nid); +#ifdef CONFIG_DEBUG_VM + if (ptr && size > 0) + memset(ptr, 0xff, size); +#endif + return ptr; +} + /** * memblock_virt_alloc_try_nid_nopanic - allocate boot memory block * @size: size of memory block to be allocated in bytes @@ -1351,8 +1389,8 @@ static void * __init memblock_virt_alloc_internal( * allocate only from memory limited by memblock.current_limit value * @nid: nid of the free area to find, %NUMA_NO_NODE for any node * - * Public version of _memblock_virt_alloc_try_nid_nopanic() which provides - * additional debug information (including caller info), if enabled. + * Public function, provides additional debug information (including caller + * info), if enabled. This function zeroes the allocated memory. * * RETURNS: * Virtual address of allocated memory block on success, NULL on failure. @@ -1362,11 +1400,17 @@ void * __init memblock_virt_alloc_try_nid_nopanic( phys_addr_t min_addr, phys_addr_t max_addr, int nid) { + void *ptr; + memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=0x%llx max_addr=0x%llx %pF\n", __func__, (u64)size, (u64)align, nid, (u64)min_addr, (u64)max_addr, (void *)_RET_IP_); - return memblock_virt_alloc_internal(size, align, min_addr, - max_addr, nid); + + ptr = memblock_virt_alloc_internal(size, align, + min_addr, max_addr, nid); + if (ptr) + memset(ptr, 0, size); + return ptr; } /** @@ -1380,7 +1424,7 @@ void * __init memblock_virt_alloc_try_nid_nopanic( * allocate only from memory limited by memblock.current_limit value * @nid: nid of the free area to find, %NUMA_NO_NODE for any node * - * Public panicking version of _memblock_virt_alloc_try_nid_nopanic() + * Public panicking version of memblock_virt_alloc_try_nid_nopanic() * which provides debug information (including caller info), if enabled, * and panics if the request can not be satisfied. * @@ -1399,8 +1443,10 @@ void * __init memblock_virt_alloc_try_nid( (u64)max_addr, (void *)_RET_IP_); ptr = memblock_virt_alloc_internal(size, align, min_addr, max_addr, nid); - if (ptr) + if (ptr) { + memset(ptr, 0, size); return ptr; + } panic("%s: Failed to allocate %llu bytes align=0x%llx nid=%d from=0x%llx max_addr=0x%llx\n", __func__, (u64)size, (u64)align, nid, (u64)min_addr, diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d132c801d2c1..a8dbd405ed94 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -7299,18 +7299,17 @@ void *__init alloc_large_system_hash(const char *tablename, log2qty = ilog2(numentries); - /* - * memblock allocator returns zeroed memory already, so HASH_ZERO is - * currently not used when HASH_EARLY is specified. - */ gfp_flags = (flags & HASH_ZERO) ? 
GFP_ATOMIC | __GFP_ZERO : GFP_ATOMIC; do { size = bucketsize << log2qty; - if (flags & HASH_EARLY) - table = memblock_virt_alloc_nopanic(size, 0); - else if (hashdist) + if (flags & HASH_EARLY) { + if (flags & HASH_ZERO) + table = memblock_virt_alloc_nopanic(size, 0); + else + table = memblock_virt_alloc_raw(size, 0); + } else if (hashdist) { table = __vmalloc(size, gfp_flags, PAGE_KERNEL); - else { + } else { /* * If bucketsize is not a power-of-two, we may free * some pages at the end of hash table which From patchwork Wed Sep 20 20:17:08 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Tatashin X-Patchwork-Id: 816477 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xyBSV4287z9s4s for ; Thu, 21 Sep 2017 06:39:54 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xyBSV3BRhzDrq8 for ; Thu, 21 Sep 2017 06:39:54 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=oracle.com (client-ip=156.151.31.81; helo=userp1040.oracle.com; envelope-from=pasha.tatashin@oracle.com; receiver=) Received: from userp1040.oracle.com (userp1040.oracle.com [156.151.31.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xyB0725ZBzDrFM for ; Thu, 21 Sep 2017 06:18:47 +1000 (AEST) Received: from aserv0022.oracle.com (aserv0022.oracle.com [141.146.126.234]) by userp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v8KKHV5I007812 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:31 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by aserv0022.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHU1C021357 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:30 GMT Received: from abhmp0008.oracle.com (abhmp0008.oracle.com [141.146.116.14]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHTdu011261; Wed, 20 Sep 2017 20:17:29 GMT Received: from xakep.us.oracle.com (/10.154.127.176) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 Sep 2017 13:17:29 -0700 From: Pavel Tatashin To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com, heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org, mhocko@kernel.org, ard.biesheuvel@linaro.org, mark.rutland@arm.com, will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org, mgorman@techsingularity.net, steven.sistare@oracle.com, daniel.m.jordan@oracle.com, bob.picco@oracle.com Subject: [PATCH v9 06/12] mm: zero struct pages during initialization Date: Wed, 20 Sep 2017 16:17:08 -0400 Message-Id: <20170920201714.19817-7-pasha.tatashin@oracle.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20170920201714.19817-1-pasha.tatashin@oracle.com> References: <20170920201714.19817-1-pasha.tatashin@oracle.com> 
X-Source-IP: aserv0022.oracle.com [141.146.126.234] X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Add struct page zeroing as a part of initialization of other fields in __init_single_page(). This single thread performance collected on: Intel(R) Xeon(R) CPU E7-8895 v3 @ 2.60GHz with 1T of memory (268400646 pages in 8 nodes): BASE FIX sparse_init 11.244671836s 0.007199623s zone_sizes_init 4.879775891s 8.355182299s -------------------------- Total 16.124447727s 8.362381922s sparse_init is where memory for struct pages is zeroed, and the zeroing part is moved later in this patch into __init_single_page(), which is called from zone_sizes_init(). Signed-off-by: Pavel Tatashin Reviewed-by: Steven Sistare Reviewed-by: Daniel Jordan Reviewed-by: Bob Picco Acked-by: Michal Hocko --- include/linux/mm.h | 9 +++++++++ mm/page_alloc.c | 1 + 2 files changed, 10 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index f8c10d336e42..50b74d628243 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -94,6 +94,15 @@ extern int mmap_rnd_compat_bits __read_mostly; #define mm_forbids_zeropage(X) (0) #endif +/* + * On some architectures it is expensive to call memset() for small sizes. + * Those architectures should provide their own implementation of "struct page" + * zeroing by defining this macro in . + */ +#ifndef mm_zero_struct_page +#define mm_zero_struct_page(pp) ((void)memset((pp), 0, sizeof(struct page))) +#endif + /* * Default maximum number of active map areas, this limits the number of vmas * per mm struct. 
Users can overwrite this number by sysctl but there is a diff --git a/mm/page_alloc.c b/mm/page_alloc.c index a8dbd405ed94..4b630ee91430 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1170,6 +1170,7 @@ static void free_one_page(struct zone *zone, static void __meminit __init_single_page(struct page *page, unsigned long pfn, unsigned long zone, int nid) { + mm_zero_struct_page(page); set_page_links(page, zone, nid, pfn); init_page_count(page); page_mapcount_reset(page); From patchwork Wed Sep 20 20:17:09 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Tatashin X-Patchwork-Id: 816467 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xyBBN50QWz9s7v for ; Thu, 21 Sep 2017 06:27:40 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xyBBN4CtPzDrSl for ; Thu, 21 Sep 2017 06:27:40 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=oracle.com (client-ip=156.151.31.81; helo=userp1040.oracle.com; envelope-from=pasha.tatashin@oracle.com; receiver=) Received: from userp1040.oracle.com (userp1040.oracle.com [156.151.31.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xy9zt3xgzzDr5G for ; Thu, 21 Sep 2017 06:18:34 +1000 (AEST) Received: from userv0021.oracle.com (userv0021.oracle.com [156.151.31.71]) by userp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v8KKHWON007866 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:32 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by userv0021.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHVTp028534 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:32 GMT Received: from abhmp0008.oracle.com (abhmp0008.oracle.com [141.146.116.14]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHVDD028115; Wed, 20 Sep 2017 20:17:31 GMT Received: from xakep.us.oracle.com (/10.154.127.176) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 Sep 2017 13:17:30 -0700 From: Pavel Tatashin To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com, heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org, mhocko@kernel.org, ard.biesheuvel@linaro.org, mark.rutland@arm.com, will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org, mgorman@techsingularity.net, steven.sistare@oracle.com, daniel.m.jordan@oracle.com, bob.picco@oracle.com Subject: [PATCH v9 07/12] sparc64: optimized struct page zeroing Date: Wed, 20 Sep 2017 16:17:09 -0400 Message-Id: <20170920201714.19817-8-pasha.tatashin@oracle.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20170920201714.19817-1-pasha.tatashin@oracle.com> References: <20170920201714.19817-1-pasha.tatashin@oracle.com> X-Source-IP: userv0021.oracle.com 
[156.151.31.71] X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Add an optimized mm_zero_struct_page(), so struct page's are zeroed without calling memset(). We do eight to ten regular stores based on the size of struct page. Compiler optimizes out the conditions of switch() statement. SPARC-M6 with 15T of memory, single thread performance: BASE FIX OPTIMIZED_FIX bootmem_init 28.440467985s 2.305674818s 2.305161615s free_area_init_nodes 202.845901673s 225.343084508s 172.556506560s -------------------------------------------- Total 231.286369658s 227.648759326s 174.861668175s BASE: current linux FIX: This patch series without "optimized struct page zeroing" OPTIMIZED_FIX: This patch series including the current patch. bootmem_init() is where memory for struct pages is zeroed during allocation. Note, about two seconds in this function is a fixed time: it does not increase as memory is increased. Signed-off-by: Pavel Tatashin Reviewed-by: Steven Sistare Reviewed-by: Daniel Jordan Reviewed-by: Bob Picco Acked-by: David S. Miller --- arch/sparc/include/asm/pgtable_64.h | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h index 4fefe3762083..8ed478abc630 100644 --- a/arch/sparc/include/asm/pgtable_64.h +++ b/arch/sparc/include/asm/pgtable_64.h @@ -230,6 +230,36 @@ extern unsigned long _PAGE_ALL_SZ_BITS; extern struct page *mem_map_zero; #define ZERO_PAGE(vaddr) (mem_map_zero) +/* This macro must be updated when the size of struct page grows above 80 + * or reduces below 64. + * The idea that compiler optimizes out switch() statement, and only + * leaves clrx instructions + */ +#define mm_zero_struct_page(pp) do { \ + unsigned long *_pp = (void *)(pp); \ + \ + /* Check that struct page is either 64, 72, or 80 bytes */ \ + BUILD_BUG_ON(sizeof(struct page) & 7); \ + BUILD_BUG_ON(sizeof(struct page) < 64); \ + BUILD_BUG_ON(sizeof(struct page) > 80); \ + \ + switch (sizeof(struct page)) { \ + case 80: \ + _pp[9] = 0; /* fallthrough */ \ + case 72: \ + _pp[8] = 0; /* fallthrough */ \ + default: \ + _pp[7] = 0; \ + _pp[6] = 0; \ + _pp[5] = 0; \ + _pp[4] = 0; \ + _pp[3] = 0; \ + _pp[2] = 0; \ + _pp[1] = 0; \ + _pp[0] = 0; \ + } \ +} while (0) + /* PFNs are real physical page numbers. However, mem_map only begins to record * per-page information starting at pfn_base. 
This is to handle systems where * the first physical page in the machine is at some huge physical address, From patchwork Wed Sep 20 20:17:10 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Tatashin X-Patchwork-Id: 816472 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xyBK3201Hz9s8J for ; Thu, 21 Sep 2017 06:33:27 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xyBK30BWnzDrpW for ; Thu, 21 Sep 2017 06:33:27 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=oracle.com (client-ip=156.151.31.81; helo=userp1040.oracle.com; envelope-from=pasha.tatashin@oracle.com; receiver=) Received: from userp1040.oracle.com (userp1040.oracle.com [156.151.31.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xyB001f1lzDrCX for ; Thu, 21 Sep 2017 06:18:39 +1000 (AEST) Received: from userv0021.oracle.com (userv0021.oracle.com [156.151.31.71]) by userp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v8KKHYrN007902 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:34 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by userv0021.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHX88028607 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:33 GMT Received: from abhmp0008.oracle.com (abhmp0008.oracle.com [141.146.116.14]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHWxL028212; Wed, 20 Sep 2017 20:17:32 GMT Received: from xakep.us.oracle.com (/10.154.127.176) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 Sep 2017 13:17:32 -0700 From: Pavel Tatashin To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com, heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org, mhocko@kernel.org, ard.biesheuvel@linaro.org, mark.rutland@arm.com, will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org, mgorman@techsingularity.net, steven.sistare@oracle.com, daniel.m.jordan@oracle.com, bob.picco@oracle.com Subject: [PATCH v9 08/12] mm: zero reserved and unavailable struct pages Date: Wed, 20 Sep 2017 16:17:10 -0400 Message-Id: <20170920201714.19817-9-pasha.tatashin@oracle.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20170920201714.19817-1-pasha.tatashin@oracle.com> References: <20170920201714.19817-1-pasha.tatashin@oracle.com> X-Source-IP: userv0021.oracle.com [156.151.31.71] X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Some memory is reserved but unavailable: 
not present in memblock.memory (because not backed by physical pages), but present in memblock.reserved. Such memory has backing struct pages, but they are not initialized by going through __init_single_page(). In some cases these struct pages are accessed even if they do not contain any data. One example is page_to_pfn() might access page->flags if this is where section information is stored (CONFIG_SPARSEMEM, SECTION_IN_PAGE_FLAGS). Since, struct pages are zeroed in __init_single_page(), and not during allocation time, we must zero such struct pages explicitly. The patch involves adding a new memblock iterator: for_each_resv_unavail_range(i, p_start, p_end) Which iterates through reserved && !memory lists, and we zero struct pages explicitly by calling mm_zero_struct_page(). Signed-off-by: Pavel Tatashin Reviewed-by: Steven Sistare Reviewed-by: Daniel Jordan Reviewed-by: Bob Picco --- include/linux/memblock.h | 16 ++++++++++++++++ include/linux/mm.h | 6 ++++++ mm/page_alloc.c | 30 ++++++++++++++++++++++++++++++ 3 files changed, 52 insertions(+) diff --git a/include/linux/memblock.h b/include/linux/memblock.h index bae11c7e7bf3..bdd4268f9323 100644 --- a/include/linux/memblock.h +++ b/include/linux/memblock.h @@ -237,6 +237,22 @@ unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn); for_each_mem_range_rev(i, &memblock.memory, &memblock.reserved, \ nid, flags, p_start, p_end, p_nid) +/** + * for_each_resv_unavail_range - iterate through reserved and unavailable memory + * @i: u64 used as loop variable + * @flags: pick from blocks based on memory attributes + * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL + * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL + * + * Walks over unavailabled but reserved (reserved && !memory) areas of memblock. + * Available as soon as memblock is initialized. + * Note: because this memory does not belong to any physical node, flags and + * nid arguments do not make sense and thus not exported as arguments. + */ +#define for_each_resv_unavail_range(i, p_start, p_end) \ + for_each_mem_range(i, &memblock.reserved, &memblock.memory, \ + NUMA_NO_NODE, MEMBLOCK_NONE, p_start, p_end, NULL) + static inline void memblock_set_region_flags(struct memblock_region *r, unsigned long flags) { diff --git a/include/linux/mm.h b/include/linux/mm.h index 50b74d628243..a7bba4ce79ba 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2010,6 +2010,12 @@ extern int __meminit __early_pfn_to_nid(unsigned long pfn, struct mminit_pfnnid_cache *state); #endif +#ifdef CONFIG_HAVE_MEMBLOCK +void zero_resv_unavail(void); +#else +static inline void zero_resv_unavail(void) {} +#endif + extern void set_dma_reserve(unsigned long new_dma_reserve); extern void memmap_init_zone(unsigned long, int, unsigned long, unsigned long, enum memmap_context); diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 4b630ee91430..1d38d391dffd 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -6202,6 +6202,34 @@ void __paginginit free_area_init_node(int nid, unsigned long *zones_size, free_area_init_core(pgdat); } +#ifdef CONFIG_HAVE_MEMBLOCK +/* + * Only struct pages that are backed by physical memory are zeroed and + * initialized by going through __init_single_page(). But, there are some + * struct pages which are reserved in memblock allocator and their fields + * may be accessed (for example page_to_pfn() on some configuration accesses + * flags). We must explicitly zero those struct pages. 
+ */ +void __paginginit zero_resv_unavail(void) +{ + phys_addr_t start, end; + unsigned long pfn; + u64 i, pgcnt; + + /* Loop through ranges that are reserved, but do not have reported + * physical memory backing. + */ + pgcnt = 0; + for_each_resv_unavail_range(i, &start, &end) { + for (pfn = PFN_DOWN(start); pfn < PFN_UP(end); pfn++) { + mm_zero_struct_page(pfn_to_page(pfn)); + pgcnt++; + } + } + pr_info("Reserved but unavailable: %lld pages", pgcnt); +} +#endif /* CONFIG_HAVE_MEMBLOCK */ + #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP #if MAX_NUMNODES > 1 @@ -6625,6 +6653,7 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn) node_set_state(nid, N_MEMORY); check_for_memory(pgdat, nid); } + zero_resv_unavail(); } static int __init cmdline_parse_core(char *p, unsigned long *core) @@ -6788,6 +6817,7 @@ void __init free_area_init(unsigned long *zones_size) { free_area_init_node(0, zones_size, __pa(PAGE_OFFSET) >> PAGE_SHIFT, NULL); + zero_resv_unavail(); } static int page_alloc_cpu_dead(unsigned int cpu) From patchwork Wed Sep 20 20:17:11 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Tatashin X-Patchwork-Id: 816468 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xyBDF1x1Fz9s7v for ; Thu, 21 Sep 2017 06:29:17 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xyBDF11BkzDqXn for ; Thu, 21 Sep 2017 06:29:17 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=oracle.com (client-ip=156.151.31.81; helo=userp1040.oracle.com; envelope-from=pasha.tatashin@oracle.com; receiver=) Received: from userp1040.oracle.com (userp1040.oracle.com [156.151.31.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xy9zx0kF2zDqYG for ; Thu, 21 Sep 2017 06:18:36 +1000 (AEST) Received: from userv0021.oracle.com (userv0021.oracle.com [156.151.31.71]) by userp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v8KKHY3i007910 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:35 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by userv0021.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHYDU028642 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:34 GMT Received: from abhmp0008.oracle.com (abhmp0008.oracle.com [141.146.116.14]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id v8KKHY6a007266; Wed, 20 Sep 2017 20:17:34 GMT Received: from xakep.us.oracle.com (/10.154.127.176) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 Sep 2017 13:17:34 -0700 From: Pavel Tatashin To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com, heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org, mhocko@kernel.org, ard.biesheuvel@linaro.org, 
From patchwork Wed Sep 20 20:17:11 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Tatashin X-Patchwork-Id: 816468 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xyBDF1x1Fz9s7v for ; Thu, 21 Sep 2017 06:29:17 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xyBDF11BkzDqXn for ; Thu, 21 Sep 2017 06:29:17 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=oracle.com (client-ip=156.151.31.81; helo=userp1040.oracle.com; envelope-from=pasha.tatashin@oracle.com; receiver=) Received: from userp1040.oracle.com (userp1040.oracle.com [156.151.31.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xy9zx0kF2zDqYG for ; Thu, 21 Sep 2017 06:18:36 +1000 (AEST) Received: from userv0021.oracle.com (userv0021.oracle.com [156.151.31.71]) by userp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v8KKHY3i007910 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:35 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by userv0021.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHYDU028642 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:34 GMT Received: from abhmp0008.oracle.com (abhmp0008.oracle.com [141.146.116.14]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id v8KKHY6a007266; Wed, 20 Sep 2017 20:17:34 GMT Received: from xakep.us.oracle.com (/10.154.127.176) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 Sep 2017 13:17:34 -0700 From: Pavel Tatashin To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com, heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org, mhocko@kernel.org, ard.biesheuvel@linaro.org, mark.rutland@arm.com, will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org, mgorman@techsingularity.net, steven.sistare@oracle.com, daniel.m.jordan@oracle.com, bob.picco@oracle.com Subject: [PATCH v9 09/12] mm/kasan: kasan specific map populate function Date: Wed, 20 Sep 2017 16:17:11 -0400 Message-Id: <20170920201714.19817-10-pasha.tatashin@oracle.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20170920201714.19817-1-pasha.tatashin@oracle.com> References: <20170920201714.19817-1-pasha.tatashin@oracle.com> X-Source-IP: userv0021.oracle.com [156.151.31.71] X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev"
During early boot, kasan uses vmemmap_populate() to establish its shadow memory. However, that interface is intended for populating the memory that backs struct pages. With this series, memory allocated by vmemmap_populate() is no longer zeroed at allocation time, while kasan expects its shadow memory to start out zeroed. Add a new kasan_map_populate() function to resolve this difference.
Signed-off-by: Pavel Tatashin --- arch/arm64/include/asm/pgtable.h | 3 ++ include/linux/kasan.h | 2 ++ mm/kasan/kasan_init.c | 67 ++++++++++++++++++++++++++++++++++++++++ 3 files changed, 72 insertions(+)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index bc4e92337d16..d89713f04354 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -381,6 +381,9 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn, PUD_TYPE_TABLE) #endif +#define pmd_large(pmd) pmd_sect(pmd) +#define pud_large(pud) pud_sect(pud) + static inline void set_pmd(pmd_t *pmdp, pmd_t pmd) { *pmdp = pmd;
diff --git a/include/linux/kasan.h b/include/linux/kasan.h index a5c7046f26b4..7e13df1722c2 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -78,6 +78,8 @@ size_t kasan_metadata_size(struct kmem_cache *cache); bool kasan_save_enable_multi_shot(void); void kasan_restore_multi_shot(bool enabled); +int __meminit kasan_map_populate(unsigned long start, unsigned long end, + int node); #else /* CONFIG_KASAN */
diff --git a/mm/kasan/kasan_init.c b/mm/kasan/kasan_init.c index 554e4c0f23a2..57a973f05f63 100644 --- a/mm/kasan/kasan_init.c +++ b/mm/kasan/kasan_init.c @@ -197,3 +197,70 @@ void __init kasan_populate_zero_shadow(const void *shadow_start, zero_p4d_populate(pgd, addr, next); } while (pgd++, addr = next, addr != end); } + +/* Creates mappings for kasan during early boot. The mapped memory is zeroed */ +int __meminit kasan_map_populate(unsigned long start, unsigned long end, + int node) +{ + unsigned long addr, pfn, next; + unsigned long long size; + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + pte_t *pte; + int ret; + + ret = vmemmap_populate(start, end, node); + /* + * We might have partially populated memory, so check for no entries, + * and zero only those that actually exist.
+ */ + for (addr = start; addr < end; addr = next) { + pgd = pgd_offset_k(addr); + if (pgd_none(*pgd)) { + next = pgd_addr_end(addr, end); + continue; + } + + p4d = p4d_offset(pgd, addr); + if (p4d_none(*p4d)) { + next = p4d_addr_end(addr, end); + continue; + } + + pud = pud_offset(p4d, addr); + if (pud_none(*pud)) { + next = pud_addr_end(addr, end); + continue; + } + if (pud_large(*pud)) { + /* This is PUD size page */ + next = pud_addr_end(addr, end); + size = PUD_SIZE; + pfn = pud_pfn(*pud); + } else { + pmd = pmd_offset(pud, addr); + if (pmd_none(*pmd)) { + next = pmd_addr_end(addr, end); + continue; + } + if (pmd_large(*pmd)) { + /* This is PMD size page */ + next = pmd_addr_end(addr, end); + size = PMD_SIZE; + pfn = pmd_pfn(*pmd); + } else { + pte = pte_offset_kernel(pmd, addr); + next = addr + PAGE_SIZE; + if (pte_none(*pte)) + continue; + /* This is base size page */ + size = PAGE_SIZE; + pfn = pte_pfn(*pte); + } + } + memset(phys_to_virt(PFN_PHYS(pfn)), 0, size); + } + return ret; +} From patchwork Wed Sep 20 20:17:12 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Tatashin X-Patchwork-Id: 816469 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xyBFl6vVvz9s7v for ; Thu, 21 Sep 2017 06:30:35 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xyBFl673lzDrbf for ; Thu, 21 Sep 2017 06:30:35 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=oracle.com (client-ip=141.146.126.69; helo=aserp1040.oracle.com; envelope-from=pasha.tatashin@oracle.com; receiver=) Received: from aserp1040.oracle.com (aserp1040.oracle.com [141.146.126.69]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xy9zy1xkwzDrFp for ; Thu, 21 Sep 2017 06:18:37 +1000 (AEST) Received: from userv0022.oracle.com (userv0022.oracle.com [156.151.31.74]) by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v8KKHaKR011493 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:37 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by userv0022.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHab9030177 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:36 GMT Received: from abhmp0008.oracle.com (abhmp0008.oracle.com [141.146.116.14]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id v8KKHZEX007276; Wed, 20 Sep 2017 20:17:35 GMT Received: from xakep.us.oracle.com (/10.154.127.176) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 Sep 2017 13:17:35 -0700 From: Pavel Tatashin To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com, heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org, mhocko@kernel.org, ard.biesheuvel@linaro.org, mark.rutland@arm.com, 
will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org, mgorman@techsingularity.net, steven.sistare@oracle.com, daniel.m.jordan@oracle.com, bob.picco@oracle.com Subject: [PATCH v9 10/12] x86/kasan: use kasan_map_populate() Date: Wed, 20 Sep 2017 16:17:12 -0400 Message-Id: <20170920201714.19817-11-pasha.tatashin@oracle.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20170920201714.19817-1-pasha.tatashin@oracle.com> References: <20170920201714.19817-1-pasha.tatashin@oracle.com> X-Source-IP: userv0022.oracle.com [156.151.31.74] X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" To optimize the performance of struct page initialization, vmemmap_populate() will no longer zero memory. Therefore, we must use a new interface to allocate and map kasan shadow memory, that also zeroes memory for us. Signed-off-by: Pavel Tatashin --- arch/x86/mm/kasan_init_64.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c index bc84b73684b7..2db95efd208e 100644 --- a/arch/x86/mm/kasan_init_64.c +++ b/arch/x86/mm/kasan_init_64.c @@ -23,7 +23,7 @@ static int __init map_range(struct range *range) start = (unsigned long)kasan_mem_to_shadow(pfn_to_kaddr(range->start)); end = (unsigned long)kasan_mem_to_shadow(pfn_to_kaddr(range->end)); - return vmemmap_populate(start, end, NUMA_NO_NODE); + return kasan_map_populate(start, end, NUMA_NO_NODE); } static void __init clear_pgds(unsigned long start, @@ -136,9 +136,9 @@ void __init kasan_init(void) kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM), kasan_mem_to_shadow((void *)__START_KERNEL_map)); - vmemmap_populate((unsigned long)kasan_mem_to_shadow(_stext), - (unsigned long)kasan_mem_to_shadow(_end), - NUMA_NO_NODE); + kasan_map_populate((unsigned long)kasan_mem_to_shadow(_stext), + (unsigned long)kasan_mem_to_shadow(_end), + NUMA_NO_NODE); kasan_populate_zero_shadow(kasan_mem_to_shadow((void *)MODULES_END), (void *)KASAN_SHADOW_END); From patchwork Wed Sep 20 20:17:13 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Tatashin X-Patchwork-Id: 816475 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xyBQP33lrz9s4s for ; Thu, 21 Sep 2017 06:38:05 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xyBQP1YChzDrZ1 for ; Thu, 21 Sep 2017 06:38:05 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=oracle.com (client-ip=141.146.126.69; helo=aserp1040.oracle.com; envelope-from=pasha.tatashin@oracle.com; receiver=) Received: from aserp1040.oracle.com (aserp1040.oracle.com [141.146.126.69]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xyB026tr3zDr5T for ; Thu, 21 Sep 2017 06:18:42 +1000 (AEST) Received: from 
aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v8KKHdlY011533 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:39 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHca7022194 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:39 GMT Received: from abhmp0008.oracle.com (abhmp0008.oracle.com [141.146.116.14]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id v8KKHbw1007930; Wed, 20 Sep 2017 20:17:37 GMT Received: from xakep.us.oracle.com (/10.154.127.176) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 Sep 2017 13:17:37 -0700 From: Pavel Tatashin To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com, heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org, mhocko@kernel.org, ard.biesheuvel@linaro.org, mark.rutland@arm.com, will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org, mgorman@techsingularity.net, steven.sistare@oracle.com, daniel.m.jordan@oracle.com, bob.picco@oracle.com Subject: [PATCH v9 11/12] arm64/kasan: use kasan_map_populate() Date: Wed, 20 Sep 2017 16:17:13 -0400 Message-Id: <20170920201714.19817-12-pasha.tatashin@oracle.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20170920201714.19817-1-pasha.tatashin@oracle.com> References: <20170920201714.19817-1-pasha.tatashin@oracle.com> X-Source-IP: aserv0021.oracle.com [141.146.126.233] X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev"
To optimize the performance of struct page initialization, vmemmap_populate() will no longer zero memory. Therefore, we must use a new interface to allocate and map kasan shadow memory that also zeroes the memory for us.
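For context, kasan_map_populate() (added by patch 09/12 in this series) is essentially vmemmap_populate() followed by a pass that clears whatever shadow memory actually got mapped. A condensed sketch of that pattern, assuming for simplicity that the whole [start, end) range ends up mapped and writable (the real implementation in patch 09 walks the page tables and clears only the PUD-, PMD-, and PTE-sized entries that are actually present):

	int kasan_map_populate_sketch(unsigned long start, unsigned long end, int node)
	{
		/* Allocate and map shadow pages; with this series the backing
		 * memory is no longer zeroed at allocation time. */
		int ret = vmemmap_populate(start, end, node);

		/* KASAN requires the shadow to start out as all zeroes, so
		 * clear the freshly mapped range before it is consulted. */
		if (!ret)
			memset((void *)start, 0, end - start);
		return ret;
	}

The x86 and arm64 call-site patches only swap vmemmap_populate() for this wrapper; the shadow ranges they pass are unchanged.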
Signed-off-by: Pavel Tatashin --- arch/arm64/mm/kasan_init.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c index 81f03959a4ab..b6e92cfa3ea3 100644 --- a/arch/arm64/mm/kasan_init.c +++ b/arch/arm64/mm/kasan_init.c @@ -161,11 +161,11 @@ void __init kasan_init(void) clear_pgds(KASAN_SHADOW_START, KASAN_SHADOW_END); - vmemmap_populate(kimg_shadow_start, kimg_shadow_end, - pfn_to_nid(virt_to_pfn(lm_alias(_text)))); + kasan_map_populate(kimg_shadow_start, kimg_shadow_end, + pfn_to_nid(virt_to_pfn(lm_alias(_text)))); /* - * vmemmap_populate() has populated the shadow region that covers the + * kasan_map_populate() has populated the shadow region that covers the * kernel image with SWAPPER_BLOCK_SIZE mappings, so we have to round * the start and end addresses to SWAPPER_BLOCK_SIZE as well, to prevent * kasan_populate_zero_shadow() from replacing the page table entries @@ -191,9 +191,9 @@ void __init kasan_init(void) if (start >= end) break; - vmemmap_populate((unsigned long)kasan_mem_to_shadow(start), - (unsigned long)kasan_mem_to_shadow(end), - pfn_to_nid(virt_to_pfn(start))); + kasan_map_populate((unsigned long)kasan_mem_to_shadow(start), + (unsigned long)kasan_mem_to_shadow(end), + pfn_to_nid(virt_to_pfn(start))); } /* From patchwork Wed Sep 20 20:17:14 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Tatashin X-Patchwork-Id: 816474 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xyBNg43q1z9s4s for ; Thu, 21 Sep 2017 06:36:35 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xyBNg2jQLzDrpj for ; Thu, 21 Sep 2017 06:36:35 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=oracle.com (client-ip=141.146.126.69; helo=aserp1040.oracle.com; envelope-from=pasha.tatashin@oracle.com; receiver=) Received: from aserp1040.oracle.com (aserp1040.oracle.com [141.146.126.69]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xyB026tpxzDr5G for ; Thu, 21 Sep 2017 06:18:42 +1000 (AEST) Received: from aserv0022.oracle.com (aserv0022.oracle.com [141.146.126.234]) by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v8KKHdr5011538 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:39 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by aserv0022.oracle.com (8.14.4/8.14.4) with ESMTP id v8KKHdPi021706 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Sep 2017 20:17:39 GMT Received: from abhmp0008.oracle.com (abhmp0008.oracle.com [141.146.116.14]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id v8KKHdaA007295; Wed, 20 Sep 2017 20:17:39 GMT Received: from xakep.us.oracle.com (/10.154.127.176) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 Sep 2017 13:17:39 -0700 From: Pavel Tatashin To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, 
linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com, heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org, mhocko@kernel.org, ard.biesheuvel@linaro.org, mark.rutland@arm.com, will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org, mgorman@techsingularity.net, steven.sistare@oracle.com, daniel.m.jordan@oracle.com, bob.picco@oracle.com Subject: [PATCH v9 12/12] mm: stop zeroing memory during allocation in vmemmap Date: Wed, 20 Sep 2017 16:17:14 -0400 Message-Id: <20170920201714.19817-13-pasha.tatashin@oracle.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20170920201714.19817-1-pasha.tatashin@oracle.com> References: <20170920201714.19817-1-pasha.tatashin@oracle.com> X-Source-IP: aserv0022.oracle.com [141.146.126.234] X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.24 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev"
vmemmap_alloc_block() will no longer zero the block, so zero memory at its call sites for everything except struct pages. Struct page memory is zeroed by struct page initialization. Replace the allocators in sparse-vmemmap with non-zeroing versions. This way we get the performance improvement from zeroing the memory in parallel when struct pages are zeroed.
Signed-off-by: Pavel Tatashin Reviewed-by: Steven Sistare Reviewed-by: Daniel Jordan Reviewed-by: Bob Picco --- include/linux/mm.h | 11 +++++++++++ mm/sparse-vmemmap.c | 15 +++++++-------- mm/sparse.c | 6 +++--- 3 files changed, 21 insertions(+), 11 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h index a7bba4ce79ba..25848764570f 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2501,6 +2501,17 @@ static inline void *vmemmap_alloc_block_buf(unsigned long size, int node) return __vmemmap_alloc_block_buf(size, node, NULL); } +static inline void *vmemmap_alloc_block_zero(unsigned long size, int node) +{ + void *p = vmemmap_alloc_block(size, node); + + if (!p) + return NULL; + memset(p, 0, size); + + return p; +} + void vmemmap_verify(pte_t *, int, unsigned long, unsigned long); int vmemmap_populate_basepages(unsigned long start, unsigned long end, int node);
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index d1a39b8051e0..c2f5654e7c9d 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -41,7 +41,7 @@ static void * __ref __earlyonly_bootmem_alloc(int node, unsigned long align, unsigned long goal) { - return memblock_virt_alloc_try_nid(size, align, goal, + return memblock_virt_alloc_try_nid_raw(size, align, goal, BOOTMEM_ALLOC_ACCESSIBLE, node); } @@ -54,9 +54,8 @@ void * __meminit vmemmap_alloc_block(unsigned long size, int node) if (slab_is_available()) { struct page *page; - page = alloc_pages_node(node, - GFP_KERNEL | __GFP_ZERO | __GFP_RETRY_MAYFAIL, - get_order(size)); + page = alloc_pages_node(node, GFP_KERNEL | __GFP_RETRY_MAYFAIL, + get_order(size)); if (page) return page_address(page); return NULL; @@ -183,7 +182,7 @@ pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node) { pmd_t *pmd = pmd_offset(pud, addr); if (pmd_none(*pmd)) { - void *p = vmemmap_alloc_block(PAGE_SIZE, node); + void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node); if (!p) return NULL; pmd_populate_kernel(&init_mm, pmd,
p); @@ -195,7 +194,7 @@ pud_t * __meminit vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int node) { pud_t *pud = pud_offset(p4d, addr); if (pud_none(*pud)) { - void *p = vmemmap_alloc_block(PAGE_SIZE, node); + void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node); if (!p) return NULL; pud_populate(&init_mm, pud, p); @@ -207,7 +206,7 @@ p4d_t * __meminit vmemmap_p4d_populate(pgd_t *pgd, unsigned long addr, int node) { p4d_t *p4d = p4d_offset(pgd, addr); if (p4d_none(*p4d)) { - void *p = vmemmap_alloc_block(PAGE_SIZE, node); + void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node); if (!p) return NULL; p4d_populate(&init_mm, p4d, p); @@ -219,7 +218,7 @@ pgd_t * __meminit vmemmap_pgd_populate(unsigned long addr, int node) { pgd_t *pgd = pgd_offset_k(addr); if (pgd_none(*pgd)) { - void *p = vmemmap_alloc_block(PAGE_SIZE, node); + void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node); if (!p) return NULL; pgd_populate(&init_mm, pgd, p); diff --git a/mm/sparse.c b/mm/sparse.c index 83b3bf6461af..d22f51bb7c79 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -437,9 +437,9 @@ void __init sparse_mem_maps_populate_node(struct page **map_map, } size = PAGE_ALIGN(size); - map = memblock_virt_alloc_try_nid(size * map_count, - PAGE_SIZE, __pa(MAX_DMA_ADDRESS), - BOOTMEM_ALLOC_ACCESSIBLE, nodeid); + map = memblock_virt_alloc_try_nid_raw(size * map_count, + PAGE_SIZE, __pa(MAX_DMA_ADDRESS), + BOOTMEM_ALLOC_ACCESSIBLE, nodeid); if (map) { for (pnum = pnum_begin; pnum < pnum_end; pnum++) { if (!present_section_nr(pnum))