From patchwork Sat Feb 10 08:11:35 2018
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 871640
From: Nicholas Piggin <npiggin@gmail.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: "Aneesh Kumar K . V", Nicholas Piggin
Subject: [RFC PATCH 1/5] powerpc/mm/slice: pass pointers to struct slice_mask where possible
Date: Sat, 10 Feb 2018 18:11:35 +1000
Message-Id: <20180210081139.27236-2-npiggin@gmail.com>
In-Reply-To: <20180210081139.27236-1-npiggin@gmail.com>
References: <20180210081139.27236-1-npiggin@gmail.com>

Pass around const pointers to struct slice_mask where possible, rather
than copies of slice_mask, to reduce stack and call overhead.

checkstack.pl gives, before:
0x00000de4 slice_get_unmapped_area [slice.o]:           656
0x00001b4c is_hugepage_only_range [slice.o]:            512
0x0000075c slice_find_area_topdown [slice.o]:           416
0x000004c8 slice_find_area_bottomup.isra.1 [slice.o]:   272
0x00001aa0 slice_set_range_psize [slice.o]:             240
0x00000a64 slice_find_area [slice.o]:                   176
0x00000174 slice_check_fit [slice.o]:                   112

after:
0x00000bd4 slice_get_unmapped_area [slice.o]:           496
0x000017cc is_hugepage_only_range [slice.o]:            352
0x00000758 slice_find_area [slice.o]:                   144
0x00001750 slice_set_range_psize [slice.o]:             144
0x00000180 slice_check_fit [slice.o]:                   128
0x000005b0 slice_find_area_bottomup.isra.2 [slice.o]:   128

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/mm/slice.c | 83 +++++++++++++++++++++++++++----------------------
 1 file changed, 45 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 23ec2c5e3b78..e8f6922d3c9b 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -50,19 +50,21 @@ struct slice_mask {
 #ifdef DEBUG
 int _slice_debug = 1;
 
-static void slice_print_mask(const char *label, struct slice_mask mask)
+static void slice_print_mask(const char *label, const struct slice_mask *mask)
 {
 	if (!_slice_debug)
 		return;
-	pr_devel("%s low_slice: %*pbl\n", label, (int)SLICE_NUM_LOW, &mask.low_slices);
-	pr_devel("%s high_slice: %*pbl\n", label, (int)SLICE_NUM_HIGH, mask.high_slices);
+	pr_devel("%s low_slice: %*pbl\n", label,
+			(int)SLICE_NUM_LOW, &mask->low_slices);
+	pr_devel("%s high_slice: %*pbl\n", label,
+			(int)SLICE_NUM_HIGH, mask->high_slices);
 }
 
 #define slice_dbg(fmt...) do { if (_slice_debug) pr_devel(fmt); } while (0)
 
 #else
 
-static void slice_print_mask(const char *label, struct slice_mask mask) {}
+static void slice_print_mask(const char *label, const struct slice_mask *mask) {}
 #define slice_dbg(fmt...)
 #endif
 
@@ -142,7 +144,8 @@ static void slice_mask_for_free(struct mm_struct *mm, struct slice_mask *ret,
 			__set_bit(i, ret->high_slices);
 }
 
-static void slice_mask_for_size(struct mm_struct *mm, int psize, struct slice_mask *ret,
+static void slice_mask_for_size(struct mm_struct *mm, int psize,
+				struct slice_mask *ret,
 				unsigned long high_limit)
 {
 	unsigned char *hpsizes;
@@ -171,7 +174,8 @@ static void slice_mask_for_size(struct mm_struct *mm, int psize, struct slice_ma
 }
 
 static int slice_check_fit(struct mm_struct *mm,
-			   struct slice_mask mask, struct slice_mask available)
+			   const struct slice_mask *mask,
+			   const struct slice_mask *available)
 {
 	DECLARE_BITMAP(result, SLICE_NUM_HIGH);
 	/*
@@ -180,11 +184,11 @@ static int slice_check_fit(struct mm_struct *mm,
 	 */
 	unsigned long slice_count = GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit);
 
-	bitmap_and(result, mask.high_slices,
-		   available.high_slices, slice_count);
+	bitmap_and(result, mask->high_slices,
+		   available->high_slices, slice_count);
 
-	return (mask.low_slices & available.low_slices) == mask.low_slices &&
-		bitmap_equal(result, mask.high_slices, slice_count);
+	return (mask->low_slices & available->low_slices) == mask->low_slices &&
+		bitmap_equal(result, mask->high_slices, slice_count);
 }
 
 static void slice_flush_segments(void *parm)
@@ -202,7 +206,8 @@ static void slice_flush_segments(void *parm)
 	local_irq_restore(flags);
 }
 
-static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psize)
+static void slice_convert(struct mm_struct *mm,
+			  const struct slice_mask *mask, int psize)
 {
 	int index, mask_index;
 	/* Write the new slice psize bits */
@@ -220,7 +225,7 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
 	lpsizes = mm->context.low_slices_psize;
 	for (i = 0; i < SLICE_NUM_LOW; i++)
-		if (mask.low_slices & (1u << i))
+		if (mask->low_slices & (1u << i))
 			lpsizes = (lpsizes & ~(0xful << (i * 4))) |
 				(((unsigned long)psize) << (i * 4));
 
@@ -231,7 +236,7 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
 	for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit); i++) {
 		mask_index = i & 0x1;
 		index = i >> 1;
-		if (test_bit(i, mask.high_slices))
+		if (test_bit(i, mask->high_slices))
 			hpsizes[index] = (hpsizes[index] &
 					  ~(0xf << (mask_index * 4))) |
 				(((unsigned long)psize) << (mask_index * 4));
@@ -254,26 +259,25 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
  * 'available' slice_mark.
  */
 static bool slice_scan_available(unsigned long addr,
-				 struct slice_mask available,
-				 int end,
-				 unsigned long *boundary_addr)
+				 const struct slice_mask *available,
+				 int end, unsigned long *boundary_addr)
 {
 	unsigned long slice;
 	if (addr < SLICE_LOW_TOP) {
 		slice = GET_LOW_SLICE_INDEX(addr);
 		*boundary_addr = (slice + end) << SLICE_LOW_SHIFT;
-		return !!(available.low_slices & (1u << slice));
+		return !!(available->low_slices & (1u << slice));
 	} else {
 		slice = GET_HIGH_SLICE_INDEX(addr);
 		*boundary_addr = (slice + end) ?
			((slice + end) << SLICE_HIGH_SHIFT) : SLICE_LOW_TOP;
-		return !!test_bit(slice, available.high_slices);
+		return !!test_bit(slice, available->high_slices);
 	}
 }
 
 static unsigned long slice_find_area_bottomup(struct mm_struct *mm,
 					      unsigned long len,
-					      struct slice_mask available,
+					      const struct slice_mask *available,
 					      int psize, unsigned long high_limit)
 {
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
@@ -319,7 +323,7 @@ static unsigned long slice_find_area_bottomup(struct mm_struct *mm,
 
 static unsigned long slice_find_area_topdown(struct mm_struct *mm,
 					     unsigned long len,
-					     struct slice_mask available,
+					     const struct slice_mask *available,
 					     int psize, unsigned long high_limit)
 {
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
@@ -377,7 +381,7 @@ static unsigned long slice_find_area_topdown(struct mm_struct *mm,
 
 static unsigned long slice_find_area(struct mm_struct *mm, unsigned long len,
-				     struct slice_mask mask, int psize,
+				     const struct slice_mask *mask, int psize,
 				     int topdown, unsigned long high_limit)
 {
 	if (topdown)
@@ -386,7 +390,8 @@ static unsigned long slice_find_area(struct mm_struct *mm, unsigned long len,
 		return slice_find_area_bottomup(mm, len, mask, psize, high_limit);
 }
 
-static inline void slice_or_mask(struct slice_mask *dst, struct slice_mask *src)
+static inline void slice_or_mask(struct slice_mask *dst,
+				 const struct slice_mask *src)
 {
 	DECLARE_BITMAP(result, SLICE_NUM_HIGH);
 
@@ -395,7 +400,8 @@ static inline void slice_or_mask(struct slice_mask *dst, struct slice_mask *src)
 	bitmap_copy(dst->high_slices, result, SLICE_NUM_HIGH);
 }
 
-static inline void slice_andnot_mask(struct slice_mask *dst, struct slice_mask *src)
+static inline void slice_andnot_mask(struct slice_mask *dst,
+				     const struct slice_mask *src)
 {
 	DECLARE_BITMAP(result, SLICE_NUM_HIGH);
 
@@ -482,7 +488,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	 * already
 	 */
 	slice_mask_for_size(mm, psize, &good_mask, high_limit);
-	slice_print_mask(" good_mask", good_mask);
+	slice_print_mask(" good_mask", &good_mask);
 
 	/*
 	 * Here "good" means slices that are already the right page size,
@@ -516,12 +522,12 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	if (addr != 0 || fixed) {
 		/* Build a mask for the requested range */
 		slice_range_to_mask(addr, len, &mask);
-		slice_print_mask(" mask", mask);
+		slice_print_mask(" mask", &mask);
 
		/* Check if we fit in the good mask.
		 * If we do, we just return,
 		 * nothing else to do
 		 */
-		if (slice_check_fit(mm, mask, good_mask)) {
+		if (slice_check_fit(mm, &mask, &good_mask)) {
 			slice_dbg(" fits good !\n");
 			return addr;
 		}
@@ -529,7 +535,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 		/* Now let's see if we can find something in the existing
 		 * slices for that size
 		 */
-		newaddr = slice_find_area(mm, len, good_mask,
+		newaddr = slice_find_area(mm, len, &good_mask,
 					  psize, topdown, high_limit);
 		if (newaddr != -ENOMEM) {
 			/* Found within the good mask, we don't have to setup,
@@ -545,9 +551,10 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	 */
 	slice_mask_for_free(mm, &potential_mask, high_limit);
 	slice_or_mask(&potential_mask, &good_mask);
-	slice_print_mask(" potential", potential_mask);
+	slice_print_mask(" potential", &potential_mask);
 
-	if ((addr != 0 || fixed) && slice_check_fit(mm, mask, potential_mask)) {
+	if ((addr != 0 || fixed) &&
+	    slice_check_fit(mm, &mask, &potential_mask)) {
 		slice_dbg(" fits potential !\n");
 		goto convert;
 	}
@@ -562,7 +569,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	 * anywhere in the good area.
 	 */
 	if (addr) {
-		addr = slice_find_area(mm, len, good_mask,
+		addr = slice_find_area(mm, len, &good_mask,
 				       psize, topdown, high_limit);
 		if (addr != -ENOMEM) {
 			slice_dbg(" found area at 0x%lx\n", addr);
@@ -573,14 +580,14 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	/* Now let's see if we can find something in the existing slices
 	 * for that size plus free slices
 	 */
-	addr = slice_find_area(mm, len, potential_mask,
+	addr = slice_find_area(mm, len, &potential_mask,
 			       psize, topdown, high_limit);
 
 #ifdef CONFIG_PPC_64K_PAGES
 	if (addr == -ENOMEM && psize == MMU_PAGE_64K) {
 		/* retry the search with 4k-page slices included */
 		slice_or_mask(&potential_mask, &compat_mask);
-		addr = slice_find_area(mm, len, potential_mask,
+		addr = slice_find_area(mm, len, &potential_mask,
 				       psize, topdown, high_limit);
 	}
 #endif
@@ -590,13 +597,13 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 
 	slice_range_to_mask(addr, len, &mask);
 	slice_dbg(" found potential area at 0x%lx\n", addr);
-	slice_print_mask(" mask", mask);
+	slice_print_mask(" mask", &mask);
 
  convert:
 	slice_andnot_mask(&mask, &good_mask);
 	slice_andnot_mask(&mask, &compat_mask);
 	if (mask.low_slices || !bitmap_empty(mask.high_slices, SLICE_NUM_HIGH)) {
-		slice_convert(mm, mask, psize);
+		slice_convert(mm, &mask, psize);
 		if (psize > MMU_PAGE_BASE)
 			on_each_cpu(slice_flush_segments, mm, 1);
 	}
@@ -725,7 +732,7 @@ void slice_set_range_psize(struct mm_struct *mm, unsigned long start,
 	VM_BUG_ON(radix_enabled());
 
 	slice_range_to_mask(start, len, &mask);
-	slice_convert(mm, mask, psize);
+	slice_convert(mm, &mask, psize);
 }
 
 #ifdef CONFIG_HUGETLB_PAGE
@@ -772,9 +779,9 @@ int is_hugepage_only_range(struct mm_struct *mm, unsigned long addr,
 #if 0 /* too verbose */
 	slice_dbg("is_hugepage_only_range(mm=%p, addr=%lx, len=%lx)\n",
 		 mm, addr, len);
-	slice_print_mask(" mask", mask);
-	slice_print_mask(" available", available);
+	slice_print_mask(" mask", &mask);
+	slice_print_mask(" available", &available);
 #endif
-	return !slice_check_fit(mm, mask, available);
+	return !slice_check_fit(mm, &mask, &available);
 }
 #endif
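To see the pattern of this patch in isolation, here is a minimal standalone
C sketch of converting a by-value struct parameter to a const pointer. The
names here (slice_mask_demo, check_fit_*) are hypothetical stand-ins for
struct slice_mask and slice_check_fit, not the kernel's types; the stack
saving is the same effect the checkstack.pl numbers in the changelog measure.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    struct slice_mask_demo {
            uint64_t low_slices;
            uint64_t high_slices[8];  /* stands in for DECLARE_BITMAP() */
    };

    /* Before: each call copies both structures onto the callee's stack. */
    static bool check_fit_by_value(struct slice_mask_demo mask,
                                   struct slice_mask_demo available)
    {
            return (mask.low_slices & available.low_slices) == mask.low_slices;
    }

    /* After: only two pointers are passed, and const documents that the
     * callee never modifies the masks. */
    static bool check_fit_by_ptr(const struct slice_mask_demo *mask,
                                 const struct slice_mask_demo *available)
    {
            return (mask->low_slices & available->low_slices) == mask->low_slices;
    }

    int main(void)
    {
            struct slice_mask_demo mask = { .low_slices = 0x3 };
            struct slice_mask_demo available = { .low_slices = 0xf };

            printf("by value: %d\n", check_fit_by_value(mask, available));
            printf("by ptr:   %d\n", check_fit_by_ptr(&mask, &available));
            return 0;
    }

A compiler can sometimes elide such copies, but with a 72-byte mask passed
through several call levels the by-value form generally costs a copy per
level, which is what the before/after stack figures above reflect.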
From patchwork Sat Feb 10 08:11:36 2018
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 871641
From: Nicholas Piggin <npiggin@gmail.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: "Aneesh Kumar K . V", Nicholas Piggin
Subject: [RFC PATCH 2/5] powerpc/mm/slice: implement a slice mask cache
Date: Sat, 10 Feb 2018 18:11:36 +1000
Message-Id: <20180210081139.27236-3-npiggin@gmail.com>
In-Reply-To: <20180210081139.27236-1-npiggin@gmail.com>
References: <20180210081139.27236-1-npiggin@gmail.com>

Calculating the slice mask can become a significant overhead for
get_unmapped_area. This patch adds a struct slice_mask for each page size
in the mm_context, and keeps these in sync with the slices psize arrays
and slb_addr_limit.

This saves about 30% kernel time on a single-page mmap/munmap micro
benchmark.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/book3s/64/mmu.h | 20 +++++++++-
 arch/powerpc/mm/slice.c                  | 68 ++++++++++++++++++++++++--------
 2 files changed, 71 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 0abeb0e2d616..b6d136fd8ffd 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -80,6 +80,16 @@ struct spinlock;
 /* Maximum possible number of NPUs in a system. */
 #define NV_MAX_NPUS 8
 
+/*
+ * One bit per slice. We have lower slices which cover 256MB segments
+ * upto 4G range. That gets us 16 low slices. For the rest we track slices
+ * in 1TB size.
+ */
+struct slice_mask {
+	u64 low_slices;
+	DECLARE_BITMAP(high_slices, SLICE_NUM_HIGH);
+};
+
 typedef struct {
 	mm_context_id_t id;
 	u16 user_psize;		/* page size index */
@@ -91,9 +101,17 @@ typedef struct {
 	struct npu_context *npu_context;
 
 #ifdef CONFIG_PPC_MM_SLICES
+	unsigned long slb_addr_limit;
 	u64 low_slices_psize;	/* SLB page size encodings */
 	unsigned char high_slices_psize[SLICE_ARRAY_SIZE];
-	unsigned long slb_addr_limit;
+# ifdef CONFIG_PPC_64K_PAGES
+	struct slice_mask mask_64k;
+# endif
+	struct slice_mask mask_4k;
+# ifdef CONFIG_HUGETLB_PAGE
+	struct slice_mask mask_16m;
+	struct slice_mask mask_16g;
+# endif
 #else
 	u16 sllp;		/* SLB page size encoding */
 #endif
diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index e8f6922d3c9b..837700bb50a9 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -37,15 +37,6 @@
 #include
 
 static DEFINE_SPINLOCK(slice_convert_lock);
-/*
- * One bit per slice. We have lower slices which cover 256MB segments
- * upto 4G range. That gets us 16 low slices. For the rest we track slices
- * in 1TB size.
- */
-struct slice_mask {
-	u64 low_slices;
-	DECLARE_BITMAP(high_slices, SLICE_NUM_HIGH);
-};
 
 #ifdef DEBUG
 int _slice_debug = 1;
@@ -144,7 +135,7 @@ static void slice_mask_for_free(struct mm_struct *mm, struct slice_mask *ret,
 			__set_bit(i, ret->high_slices);
 }
 
-static void slice_mask_for_size(struct mm_struct *mm, int psize,
+static void calc_slice_mask_for_size(struct mm_struct *mm, int psize,
 				struct slice_mask *ret,
 				unsigned long high_limit)
 {
@@ -173,6 +164,40 @@ static void slice_mask_for_size(struct mm_struct *mm, int psize,
 	}
 }
 
+#ifdef CONFIG_PPC_BOOK3S_64
+static void recalc_slice_mask_cache(struct mm_struct *mm)
+{
+	unsigned long l = mm->context.slb_addr_limit;
+	calc_slice_mask_for_size(mm, MMU_PAGE_4K, &mm->context.mask_4k, l);
+#ifdef CONFIG_PPC_64K_PAGES
+	calc_slice_mask_for_size(mm, MMU_PAGE_64K, &mm->context.mask_64k, l);
+#endif
+#ifdef CONFIG_HUGETLB_PAGE
+	calc_slice_mask_for_size(mm, MMU_PAGE_16M, &mm->context.mask_16m, l);
+	calc_slice_mask_for_size(mm, MMU_PAGE_16G, &mm->context.mask_16g, l);
+#endif
+}
+
+static const struct slice_mask *slice_mask_for_size(struct mm_struct *mm, int psize)
+{
+#ifdef CONFIG_PPC_64K_PAGES
+	if (psize == MMU_PAGE_64K)
+		return &mm->context.mask_64k;
+#endif
+	if (psize == MMU_PAGE_4K)
+		return &mm->context.mask_4k;
+#ifdef CONFIG_HUGETLB_PAGE
+	if (psize == MMU_PAGE_16M)
+		return &mm->context.mask_16m;
+	if (psize == MMU_PAGE_16G)
+		return &mm->context.mask_16g;
+#endif
+	BUG();
+}
+#else
+#error "Must define the slice masks for page sizes supported by the platform"
+#endif
+
 static int slice_check_fit(struct mm_struct *mm,
 			   const struct slice_mask *mask,
 			   const struct slice_mask *available)
@@ -246,6 +271,8 @@ static void slice_convert(struct mm_struct *mm,
 		  (unsigned long)mm->context.low_slices_psize,
 		  (unsigned long)mm->context.high_slices_psize);
 
+	recalc_slice_mask_cache(mm);
+
 	spin_unlock_irqrestore(&slice_convert_lock, flags);
 
 	copro_flush_all_slbs(mm);
@@ -448,7 +475,14 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	}
 
 	if (high_limit > mm->context.slb_addr_limit) {
+		unsigned long flags;
+
 		mm->context.slb_addr_limit = high_limit;
+
+		spin_lock_irqsave(&slice_convert_lock, flags);
+		recalc_slice_mask_cache(mm);
+		spin_unlock_irqrestore(&slice_convert_lock, flags);
+
 		on_each_cpu(slice_flush_segments, mm, 1);
 	}
 
@@ -487,7 +521,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	/* First make up a "good" mask of slices that have the right size
 	 * already
 	 */
-	slice_mask_for_size(mm, psize, &good_mask, high_limit);
+	good_mask = *slice_mask_for_size(mm, psize);
 	slice_print_mask(" good_mask", &good_mask);
 
 	/*
@@ -512,7 +546,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 #ifdef CONFIG_PPC_64K_PAGES
 	/* If we support combo pages, we can allow 64k pages in 4k slices */
 	if (psize == MMU_PAGE_64K) {
-		slice_mask_for_size(mm, MMU_PAGE_4K, &compat_mask, high_limit);
+		compat_mask = *slice_mask_for_size(mm, MMU_PAGE_4K);
 		if (fixed)
 			slice_or_mask(&good_mask, &compat_mask);
 	}
@@ -693,7 +727,7 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
 		goto bail;
 
 	mm->context.user_psize = psize;
-	wmb();
+	wmb(); /* Why? */
 
 	lpsizes = mm->context.low_slices_psize;
 	for (i = 0; i < SLICE_NUM_LOW; i++)
@@ -720,6 +754,9 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
 		  (unsigned long)mm->context.low_slices_psize,
 		  (unsigned long)mm->context.high_slices_psize);
 
+	recalc_slice_mask_cache(mm);
+	spin_unlock_irqrestore(&slice_convert_lock, flags);
+	return;
 bail:
 	spin_unlock_irqrestore(&slice_convert_lock, flags);
 }
@@ -760,18 +797,17 @@ int is_hugepage_only_range(struct mm_struct *mm, unsigned long addr,
 {
 	struct slice_mask mask, available;
 	unsigned int psize = mm->context.user_psize;
-	unsigned long high_limit = mm->context.slb_addr_limit;
 
 	if (radix_enabled())
 		return 0;
 
 	slice_range_to_mask(addr, len, &mask);
-	slice_mask_for_size(mm, psize, &available, high_limit);
+	available = *slice_mask_for_size(mm, psize);
 #ifdef CONFIG_PPC_64K_PAGES
 	/* We need to account for 4k slices too */
 	if (psize == MMU_PAGE_64K) {
 		struct slice_mask compat_mask;
-		slice_mask_for_size(mm, MMU_PAGE_4K, &compat_mask, high_limit);
+		compat_mask = *slice_mask_for_size(mm, MMU_PAGE_4K);
 		slice_or_mask(&available, &compat_mask);
 	}
 #endif
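The caching scheme can be shown with a small standalone sketch, not kernel
code: masks are recomputed once per update (the patch does this under
slice_convert_lock, and again when slb_addr_limit grows), and readers get a
cheap pointer instead of a recomputation. All names below are hypothetical.

    #include <stdint.h>
    #include <stdio.h>

    struct demo_context {
            uint8_t psize_of_slice[16]; /* stands in for low_slices_psize */
            uint64_t mask_cache[4];     /* one cached mask per page size */
    };

    static uint64_t calc_mask_for_size(const struct demo_context *ctx, int psize)
    {
            uint64_t mask = 0;
            for (int i = 0; i < 16; i++)
                    if (ctx->psize_of_slice[i] == psize)
                            mask |= 1ULL << i;
            return mask;
    }

    /* Called from every updater, so lookups never recompute. */
    static void recalc_mask_cache(struct demo_context *ctx)
    {
            for (int psize = 0; psize < 4; psize++)
                    ctx->mask_cache[psize] = calc_mask_for_size(ctx, psize);
    }

    static const uint64_t *mask_for_size(const struct demo_context *ctx, int psize)
    {
            return &ctx->mask_cache[psize]; /* O(1), no scan of psize array */
    }

    int main(void)
    {
            struct demo_context ctx = { .psize_of_slice = { 0, 1, 1, 2 } };

            recalc_mask_cache(&ctx); /* keep cache in sync after updates */
            printf("mask for psize 1: %#llx\n",
                   (unsigned long long)*mask_for_size(&ctx, 1));
            return 0;
    }

The design choice is the usual one for caching: conversions are rare and
already serialized by a lock, while get_unmapped_area is hot, so the
recomputation cost is moved onto the rare path.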
From patchwork Sat Feb 10 08:11:37 2018
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 871642
From: Nicholas Piggin <npiggin@gmail.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: "Aneesh Kumar K . V", Nicholas Piggin
Subject: [RFC PATCH 3/5] powerpc/mm/slice: implement slice_check_range_fits
Date: Sat, 10 Feb 2018 18:11:37 +1000
Message-Id: <20180210081139.27236-4-npiggin@gmail.com>
In-Reply-To: <20180210081139.27236-1-npiggin@gmail.com>
References: <20180210081139.27236-1-npiggin@gmail.com>

Rather than build slice masks from a range then use that to check for
fit in a candidate mask, implement slice_check_range_fits() that checks
if a range fits in a mask directly.

This allows several structures to be removed from the stack. Also, a
huge range is not expected in most of these cases, so building and
comparing a full mask is more expensive than testing just the one or
two bits the range covers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/mm/slice.c | 67 ++++++++++++++++++++++++-----------------------
 1 file changed, 35 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 837700bb50a9..98497c105d7d 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -198,22 +198,35 @@ static const struct slice_mask *slice_mask_for_size(struct mm_struct *mm, int ps
 #error "Must define the slice masks for page sizes supported by the platform"
 #endif
 
-static int slice_check_fit(struct mm_struct *mm,
-			   const struct slice_mask *mask,
-			   const struct slice_mask *available)
+static bool slice_check_range_fits(struct mm_struct *mm,
+			   const struct slice_mask *available,
+			   unsigned long start, unsigned long len)
 {
-	DECLARE_BITMAP(result, SLICE_NUM_HIGH);
-	/*
-	 * Make sure we just do bit compare only to the max
-	 * addr limit and not the full bit map size.
-	 */
-	unsigned long slice_count = GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit);
+	unsigned long end = start + len - 1;
+	u64 low_slices = 0;
+
+	if (start < SLICE_LOW_TOP) {
+		unsigned long mend = min(end, (SLICE_LOW_TOP - 1));
+
+		low_slices = (1u << (GET_LOW_SLICE_INDEX(mend) + 1))
+				- (1u << GET_LOW_SLICE_INDEX(start));
+	}
+	if ((low_slices & available->low_slices) != low_slices)
+		return false;
+
+	if ((start + len) > SLICE_LOW_TOP) {
+		unsigned long start_index = GET_HIGH_SLICE_INDEX(start);
+		unsigned long align_end = ALIGN(end, (1UL << SLICE_HIGH_SHIFT));
+		unsigned long count = GET_HIGH_SLICE_INDEX(align_end) - start_index;
+		unsigned long i;
 
-	bitmap_and(result, mask->high_slices,
-		   available->high_slices, slice_count);
+		for (i = start_index; i < start_index + count; i++) {
+			if (!test_bit(i, available->high_slices))
+				return false;
+		}
+	}
 
-	return (mask->low_slices & available->low_slices) == mask->low_slices &&
-		bitmap_equal(result, mask->high_slices, slice_count);
+	return true;
 }
 
 static void slice_flush_segments(void *parm)
@@ -486,12 +499,6 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 		on_each_cpu(slice_flush_segments, mm, 1);
 	}
 
-	/*
-	 * init different masks
-	 */
-	mask.low_slices = 0;
-	bitmap_zero(mask.high_slices, SLICE_NUM_HIGH);
-
 	/* silence stupid warning */;
 	potential_mask.low_slices = 0;
 	bitmap_zero(potential_mask.high_slices, SLICE_NUM_HIGH);
@@ -553,15 +560,11 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 #endif
 
 	/* First check hint if it's valid or if we have MAP_FIXED */
-	if (addr != 0 || fixed) {
-		/* Build a mask for the requested range */
-		slice_range_to_mask(addr, len, &mask);
-		slice_print_mask(" mask", &mask);
-
+	if (addr || fixed) {
		/* Check if we fit in the good mask.
		 * If we do, we just return,
 		 * nothing else to do
 		 */
-		if (slice_check_fit(mm, &mask, &good_mask)) {
+		if (slice_check_range_fits(mm, &good_mask, addr, len)) {
 			slice_dbg(" fits good !\n");
 			return addr;
 		}
@@ -587,10 +590,11 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	slice_or_mask(&potential_mask, &good_mask);
 	slice_print_mask(" potential", &potential_mask);
 
-	if ((addr != 0 || fixed) &&
-	    slice_check_fit(mm, &mask, &potential_mask)) {
-		slice_dbg(" fits potential !\n");
-		goto convert;
+	if (addr || fixed) {
+		if (slice_check_range_fits(mm, &potential_mask, addr, len)) {
+			slice_dbg(" fits potential !\n");
+			goto convert;
+		}
 	}
 
 	/* If we have MAP_FIXED and failed the above steps, then error out */
@@ -795,13 +799,12 @@ void slice_set_range_psize(struct mm_struct *mm, unsigned long start,
 int is_hugepage_only_range(struct mm_struct *mm, unsigned long addr,
 			   unsigned long len)
 {
-	struct slice_mask mask, available;
+	struct slice_mask available;
 	unsigned int psize = mm->context.user_psize;
 
 	if (radix_enabled())
 		return 0;
 
-	slice_range_to_mask(addr, len, &mask);
 	available = *slice_mask_for_size(mm, psize);
 #ifdef CONFIG_PPC_64K_PAGES
 	/* We need to account for 4k slices too */
@@ -818,6 +821,6 @@ int is_hugepage_only_range(struct mm_struct *mm, unsigned long addr,
 	slice_print_mask(" mask", &mask);
 	slice_print_mask(" available", &available);
 #endif
-	return !slice_check_fit(mm, &mask, &available);
+	return !slice_check_range_fits(mm, &available, addr, len);
 }
 #endif
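A simplified userspace model of the slice_check_range_fits() idea: instead of
materialising a bitmap for the requested range and intersecting it with
'available', test only the bits the range actually covers. This uses a single
64-bit word and hypothetical names; the kernel version additionally splits
low (256MB) and high (1TB) slices.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define DEMO_SLICE_SHIFT 28 /* 256MB slices, like the low range */

    static bool range_fits(uint64_t available, uint64_t start, uint64_t len)
    {
            uint64_t first = start >> DEMO_SLICE_SHIFT;
            uint64_t last = (start + len - 1) >> DEMO_SLICE_SHIFT;

            /* A short mmap touches one or two slices, so this loop is
             * usually one or two bit tests, cheaper than building and
             * comparing a full range mask. */
            for (uint64_t i = first; i <= last; i++)
                    if (!(available & (1ULL << i)))
                            return false;
            return true;
    }

    int main(void)
    {
            uint64_t available = 0x7; /* slices 0-2 usable */

            printf("%d\n", range_fits(available, 0, 1UL << 28));       /* 1 */
            printf("%d\n", range_fits(available, 3UL << 28, 1 << 20)); /* 0 */
            return 0;
    }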
From patchwork Sat Feb 10 08:11:38 2018
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 871643
From: Nicholas Piggin <npiggin@gmail.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: "Aneesh Kumar K . V", Nicholas Piggin
Subject: [RFC PATCH 4/5] powerpc/mm/slice: Use const pointers to cached slice masks where possible
Date: Sat, 10 Feb 2018 18:11:38 +1000
Message-Id: <20180210081139.27236-5-npiggin@gmail.com>
In-Reply-To: <20180210081139.27236-1-npiggin@gmail.com>
References: <20180210081139.27236-1-npiggin@gmail.com>

The slice_mask cache was a basic conversion which copied the slice mask
into the caller's structures, because that is how the original code
worked. In most cases the pointer can be used directly instead, saving
a copy and an on-stack structure.

This also converts the slice_mask bit operation helpers to the usual
3-operand kind, which is clearer to work with. And we remove some
unnecessary intermediate bitmaps, reducing stack and copy overhead
further.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/mm/slice.c | 78 ++++++++++++++++++++++++++++---------------------
 1 file changed, 44 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 98497c105d7d..b2e6c7667bc5 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -430,25 +430,28 @@ static unsigned long slice_find_area(struct mm_struct *mm, unsigned long len,
 	return slice_find_area_bottomup(mm, len, mask, psize, high_limit);
 }
 
-static inline void slice_or_mask(struct slice_mask *dst,
+static inline void slice_copy_mask(struct slice_mask *dst,
 				 const struct slice_mask *src)
 {
-	DECLARE_BITMAP(result, SLICE_NUM_HIGH);
+	dst->low_slices = src->low_slices;
+	bitmap_copy(dst->high_slices, src->high_slices, SLICE_NUM_HIGH);
+}
 
-	dst->low_slices |= src->low_slices;
-	bitmap_or(result, dst->high_slices, src->high_slices, SLICE_NUM_HIGH);
-	bitmap_copy(dst->high_slices, result, SLICE_NUM_HIGH);
+static inline void slice_or_mask(struct slice_mask *dst,
+				 const struct slice_mask *src1,
+				 const struct slice_mask *src2)
+{
+	dst->low_slices = src1->low_slices | src2->low_slices;
+	bitmap_or(dst->high_slices, src1->high_slices, src2->high_slices, SLICE_NUM_HIGH);
 }
 
 static inline void slice_andnot_mask(struct slice_mask *dst,
-				     const struct slice_mask *src)
+				     const struct slice_mask *src1,
+				     const struct slice_mask *src2)
 {
-	DECLARE_BITMAP(result, SLICE_NUM_HIGH);
-
-	dst->low_slices &= ~src->low_slices;
+	dst->low_slices = src1->low_slices & ~src2->low_slices;
 
-	bitmap_andnot(result, dst->high_slices, src->high_slices, SLICE_NUM_HIGH);
-	bitmap_copy(dst->high_slices, result, SLICE_NUM_HIGH);
+	bitmap_andnot(dst->high_slices, src1->high_slices, src2->high_slices, SLICE_NUM_HIGH);
 }
 
 #ifdef CONFIG_PPC_64K_PAGES
@@ -461,10 +464,10 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 				      unsigned long flags, unsigned int psize,
 				      int topdown)
 {
-	struct slice_mask mask;
 	struct slice_mask good_mask;
 	struct slice_mask potential_mask;
-	struct slice_mask compat_mask;
+	const struct slice_mask *maskp;
+	const struct slice_mask *compat_maskp = NULL;
 	int fixed = (flags & MAP_FIXED);
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
 	unsigned long page_size = 1UL << pshift;
@@ -503,9 +506,6 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	potential_mask.low_slices = 0;
 	bitmap_zero(potential_mask.high_slices, SLICE_NUM_HIGH);
 
-	compat_mask.low_slices = 0;
-	bitmap_zero(compat_mask.high_slices, SLICE_NUM_HIGH);
-
 	/* Sanity checks */
 	BUG_ON(mm->task_size == 0);
 	BUG_ON(mm->context.slb_addr_limit == 0);
@@ -528,7 +528,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	/* First make up a "good" mask of slices that have the right size
 	 * already
 	 */
-	good_mask = *slice_mask_for_size(mm, psize);
+	maskp = slice_mask_for_size(mm, psize);
 	slice_print_mask(" good_mask", &good_mask);
 
 	/*
@@ -553,11 +553,16 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 #ifdef CONFIG_PPC_64K_PAGES
 	/* If we support combo pages, we can allow 64k pages in 4k slices */
 	if (psize == MMU_PAGE_64K) {
-		compat_mask = *slice_mask_for_size(mm, MMU_PAGE_4K);
+		compat_maskp = slice_mask_for_size(mm, MMU_PAGE_4K);
 		if (fixed)
-			slice_or_mask(&good_mask, &compat_mask);
-	}
+			slice_or_mask(&good_mask, maskp, compat_maskp);
+		else
+			slice_copy_mask(&good_mask, maskp);
+	} else
 #endif
+	{
+		slice_copy_mask(&good_mask, maskp);
+	}
 
	/* First check hint if it's valid or if we have MAP_FIXED */
 	if (addr || fixed) {
@@ -587,7 +592,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	 * empty and thus can be converted
 	 */
 	slice_mask_for_free(mm, &potential_mask, high_limit);
-	slice_or_mask(&potential_mask, &good_mask);
+	slice_or_mask(&potential_mask, &potential_mask, &good_mask);
 	slice_print_mask(" potential", &potential_mask);
 
 	if (addr || fixed) {
@@ -624,7 +629,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 #ifdef CONFIG_PPC_64K_PAGES
 	if (addr == -ENOMEM && psize == MMU_PAGE_64K) {
 		/* retry the search with 4k-page slices included */
-		slice_or_mask(&potential_mask, &compat_mask);
+		slice_or_mask(&potential_mask, &potential_mask, compat_maskp);
 		addr = slice_find_area(mm, len, &potential_mask,
 				       psize, topdown, high_limit);
 	}
@@ -633,15 +638,17 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	if (addr == -ENOMEM)
 		return -ENOMEM;
 
-	slice_range_to_mask(addr, len, &mask);
+	slice_range_to_mask(addr, len, &potential_mask);
 	slice_dbg(" found potential area at 0x%lx\n", addr);
-	slice_print_mask(" mask", &mask);
+	slice_print_mask(" mask", maskp);
 
  convert:
-	slice_andnot_mask(&mask, &good_mask);
-	slice_andnot_mask(&mask, &compat_mask);
-	if (mask.low_slices || !bitmap_empty(mask.high_slices, SLICE_NUM_HIGH)) {
-		slice_convert(mm, &mask, psize);
+	slice_andnot_mask(&potential_mask, &potential_mask, &good_mask);
+	if (compat_maskp && !fixed)
+		slice_andnot_mask(&potential_mask, &potential_mask, compat_maskp);
+	if (potential_mask.low_slices ||
+	    !bitmap_empty(potential_mask.high_slices, SLICE_NUM_HIGH)) {
+		slice_convert(mm, &potential_mask, psize);
 		if (psize > MMU_PAGE_BASE)
 			on_each_cpu(slice_flush_segments, mm, 1);
 	}
@@ -799,19 +806,22 @@ void slice_set_range_psize(struct mm_struct *mm, unsigned long start,
 int is_hugepage_only_range(struct mm_struct *mm, unsigned long addr,
 			   unsigned long len)
 {
-	struct slice_mask available;
+	const struct slice_mask *maskp;
 	unsigned int psize = mm->context.user_psize;
 
 	if (radix_enabled())
 		return 0;
 
-	available = *slice_mask_for_size(mm, psize);
+	maskp = slice_mask_for_size(mm, psize);
 #ifdef CONFIG_PPC_64K_PAGES
 	/* We need to account for 4k slices too */
 	if (psize == MMU_PAGE_64K) {
-		struct slice_mask compat_mask;
-		compat_mask = *slice_mask_for_size(mm, MMU_PAGE_4K);
-		slice_or_mask(&available, &compat_mask);
+		const struct slice_mask *compat_maskp;
+		struct slice_mask available;
+
+		compat_maskp = slice_mask_for_size(mm, MMU_PAGE_4K);
+		slice_or_mask(&available, maskp, compat_maskp);
+		return !slice_check_range_fits(mm, &available, addr, len);
 	}
 #endif
@@ -821,6 +831,6 @@ int is_hugepage_only_range(struct mm_struct *mm, unsigned long addr,
 	slice_print_mask(" mask", &mask);
 	slice_print_mask(" available", &available);
 #endif
-	return !slice_check_range_fits(mm, &available, addr, len);
+	return !slice_check_range_fits(mm, maskp, addr, len);
 }
 #endif
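The 2-operand to 3-operand helper change can be illustrated in isolation.
The old helpers built the result in a temporary DECLARE_BITMAP and then
copied it into dst; a 3-operand form writes the result directly and covers
both in-place (dst == src1) and out-of-place uses. This sketch uses
hypothetical demo names, with plain uint64_t words standing in for the
kernel bitmap API.

    #include <stdint.h>
    #include <stdio.h>

    #define DEMO_WORDS 4

    /* 3-operand form: dst = src1 | src2, no intermediate bitmap, no copy. */
    static void demo_bitmap_or(uint64_t *dst, const uint64_t *src1,
                               const uint64_t *src2, int words)
    {
            for (int i = 0; i < words; i++)
                    dst[i] = src1[i] | src2[i];
    }

    int main(void)
    {
            uint64_t a[DEMO_WORDS] = { 0x1 }, b[DEMO_WORDS] = { 0x2 };
            uint64_t out[DEMO_WORDS];

            demo_bitmap_or(out, a, b, DEMO_WORDS); /* out = a | b */
            demo_bitmap_or(a, a, b, DEMO_WORDS);   /* in place: a |= b */
            printf("%#llx %#llx\n", (unsigned long long)out[0],
                   (unsigned long long)a[0]);
            return 0;
    }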
From patchwork Sat Feb 10 08:11:39 2018
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 871644
From: Nicholas Piggin <npiggin@gmail.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: "Aneesh Kumar K . V", Nicholas Piggin
Subject: [RFC PATCH 5/5] powerpc/mm/slice: use the dynamic high slice size to limit bitmap operations
Date: Sat, 10 Feb 2018 18:11:39 +1000
Message-Id: <20180210081139.27236-6-npiggin@gmail.com>
In-Reply-To: <20180210081139.27236-1-npiggin@gmail.com>
References: <20180210081139.27236-1-npiggin@gmail.com>
The number of high slices a process might use now depends on its address
space size, and on what allocation address it has requested. This patch
uses that limit throughout the call chains where possible, rather than
using the fixed SLICE_NUM_HIGH for bitmap operations.

This saves some cost for processes that don't use very large address
spaces.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/mm/slice.c | 98 +++++++++++++++++++++++++++----------------------
 1 file changed, 55 insertions(+), 43 deletions(-)

diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index b2e6c7667bc5..bec68ea07e29 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -61,13 +61,12 @@ static void slice_print_mask(const char *label, const struct slice_mask *mask) {
 #endif
 
 static void slice_range_to_mask(unsigned long start, unsigned long len,
-				struct slice_mask *ret)
+				struct slice_mask *ret,
+				unsigned long high_slices)
 {
 	unsigned long end = start + len - 1;
 
 	ret->low_slices = 0;
-	bitmap_zero(ret->high_slices, SLICE_NUM_HIGH);
-
 	if (start < SLICE_LOW_TOP) {
 		unsigned long mend = min(end, (SLICE_LOW_TOP - 1));
 
@@ -75,6 +74,7 @@ static void slice_range_to_mask(unsigned long start, unsigned long len,
 			- (1u << GET_LOW_SLICE_INDEX(start));
 	}
 
+	bitmap_zero(ret->high_slices, high_slices);
 	if ((start + len) > SLICE_LOW_TOP) {
 		unsigned long start_index = GET_HIGH_SLICE_INDEX(start);
 		unsigned long align_end = ALIGN(end, (1UL << SLICE_HIGH_SHIFT));
@@ -116,28 +116,27 @@ static int slice_high_has_vma(struct mm_struct *mm, unsigned long slice)
 }
 
 static void slice_mask_for_free(struct mm_struct *mm, struct slice_mask *ret,
-				unsigned long high_limit)
+				unsigned long high_slices)
 {
 	unsigned long i;
 
 	ret->low_slices = 0;
-	bitmap_zero(ret->high_slices, SLICE_NUM_HIGH);
-
 	for (i = 0; i < SLICE_NUM_LOW; i++)
 		if (!slice_low_has_vma(mm, i))
 			ret->low_slices |= 1u << i;
 
-	if (high_limit <= SLICE_LOW_TOP)
+	if (!high_slices)
 		return;
 
-	for (i = 0; i < GET_HIGH_SLICE_INDEX(high_limit); i++)
+	bitmap_zero(ret->high_slices, high_slices);
+	for (i = 0; i < high_slices; i++)
 		if (!slice_high_has_vma(mm, i))
 			__set_bit(i, ret->high_slices);
 }
 
 static void calc_slice_mask_for_size(struct mm_struct *mm, int psize,
 				struct slice_mask *ret,
-				unsigned long high_limit)
+				unsigned long high_slices)
 {
 	unsigned char *hpsizes;
 	int index, mask_index;
@@ -145,18 +144,17 @@ static void calc_slice_mask_for_size(struct mm_struct *mm, int psize,
 	u64 lpsizes;
 
 	ret->low_slices = 0;
-	bitmap_zero(ret->high_slices, SLICE_NUM_HIGH);
-
 	lpsizes = mm->context.low_slices_psize;
 	for (i = 0; i < SLICE_NUM_LOW; i++)
 		if (((lpsizes >> (i * 4)) & 0xf) == psize)
 			ret->low_slices |= 1u << i;
 
-	if (high_limit <= SLICE_LOW_TOP)
+	if (!high_slices)
 		return;
 
+	bitmap_zero(ret->high_slices, high_slices);
 	hpsizes = mm->context.high_slices_psize;
-	for (i = 0; i < GET_HIGH_SLICE_INDEX(high_limit); i++) {
+	for (i = 0; i < high_slices; i++) {
 		mask_index = i & 0x1;
 		index = i >> 1;
 		if (((hpsizes[index] >> (mask_index * 4)) & 0xf) == psize)
@@ -165,16 +163,15 @@ static void calc_slice_mask_for_size(struct mm_struct *mm, int psize,
 }
 
 #ifdef CONFIG_PPC_BOOK3S_64
-static void recalc_slice_mask_cache(struct mm_struct *mm)
+static void recalc_slice_mask_cache(struct mm_struct *mm, unsigned long high_slices)
 {
-	unsigned long l =
 mm->context.slb_addr_limit;
-	calc_slice_mask_for_size(mm, MMU_PAGE_4K, &mm->context.mask_4k, l);
+	calc_slice_mask_for_size(mm, MMU_PAGE_4K, &mm->context.mask_4k, high_slices);
 #ifdef CONFIG_PPC_64K_PAGES
-	calc_slice_mask_for_size(mm, MMU_PAGE_64K, &mm->context.mask_64k, l);
+	calc_slice_mask_for_size(mm, MMU_PAGE_64K, &mm->context.mask_64k, high_slices);
 #endif
 #ifdef CONFIG_HUGETLB_PAGE
-	calc_slice_mask_for_size(mm, MMU_PAGE_16M, &mm->context.mask_16m, l);
-	calc_slice_mask_for_size(mm, MMU_PAGE_16G, &mm->context.mask_16g, l);
+	calc_slice_mask_for_size(mm, MMU_PAGE_16M, &mm->context.mask_16m, high_slices);
+	calc_slice_mask_for_size(mm, MMU_PAGE_16G, &mm->context.mask_16g, high_slices);
 #endif
 }
 
@@ -252,6 +249,7 @@ static void slice_convert(struct mm_struct *mm,
 	unsigned char *hpsizes;
 	u64 lpsizes;
 	unsigned long i, flags;
+	unsigned long high_slices;
 
 	slice_dbg("slice_convert(mm=%p, psize=%d)\n", mm, psize);
 	slice_print_mask(" mask", mask);
@@ -271,7 +269,8 @@ static void slice_convert(struct mm_struct *mm,
 	mm->context.low_slices_psize = lpsizes;
 
 	hpsizes = mm->context.high_slices_psize;
-	for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit); i++) {
+	high_slices = GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit);
+	for (i = 0; i < high_slices; i++) {
 		mask_index = i & 0x1;
 		index = i >> 1;
 		if (test_bit(i, mask->high_slices))
@@ -284,7 +283,7 @@ static void slice_convert(struct mm_struct *mm,
 		  (unsigned long)mm->context.low_slices_psize,
 		  (unsigned long)mm->context.high_slices_psize);
 
-	recalc_slice_mask_cache(mm);
+	recalc_slice_mask_cache(mm, high_slices);
 
 	spin_unlock_irqrestore(&slice_convert_lock, flags);
 
@@ -431,27 +430,32 @@ static unsigned long slice_find_area(struct mm_struct *mm, unsigned long len,
 }
 
 static inline void slice_copy_mask(struct slice_mask *dst,
-				   const struct slice_mask *src)
+				   const struct slice_mask *src,
+				   unsigned long high_slices)
 {
 	dst->low_slices = src->low_slices;
-	bitmap_copy(dst->high_slices, src->high_slices, SLICE_NUM_HIGH);
+	bitmap_copy(dst->high_slices, src->high_slices, high_slices);
 }
 
 static inline void slice_or_mask(struct slice_mask *dst,
 				 const struct slice_mask *src1,
-				 const struct slice_mask *src2)
+				 const struct slice_mask *src2,
+				 unsigned long high_slices)
 {
 	dst->low_slices = src1->low_slices | src2->low_slices;
-	bitmap_or(dst->high_slices, src1->high_slices, src2->high_slices, SLICE_NUM_HIGH);
+	bitmap_or(dst->high_slices, src1->high_slices, src2->high_slices,
+			high_slices);
 }
 
 static inline void slice_andnot_mask(struct slice_mask *dst,
 				     const struct slice_mask *src1,
-				     const struct slice_mask *src2)
+				     const struct slice_mask *src2,
+				     unsigned long high_slices)
 {
 	dst->low_slices = src1->low_slices & ~src2->low_slices;
-	bitmap_andnot(dst->high_slices, src1->high_slices, src2->high_slices, SLICE_NUM_HIGH);
+	bitmap_andnot(dst->high_slices, src1->high_slices, src2->high_slices,
+			high_slices);
 }
 
 #ifdef CONFIG_PPC_64K_PAGES
@@ -474,6 +478,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	struct mm_struct *mm = current->mm;
 	unsigned long newaddr;
 	unsigned long high_limit;
+	unsigned long high_slices;
 
 	high_limit = DEFAULT_MAP_WINDOW;
 	if (addr >= high_limit || (fixed && (addr + len > high_limit)))
@@ -490,13 +495,14 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 		return -ENOMEM;
 	}
 
+	high_slices = GET_HIGH_SLICE_INDEX(high_limit);
 	if (high_limit > mm->context.slb_addr_limit) {
 		unsigned long flags;
 
 		mm->context.slb_addr_limit = high_limit;
 
 		spin_lock_irqsave(&slice_convert_lock, flags);
-		recalc_slice_mask_cache(mm);
+		recalc_slice_mask_cache(mm, high_slices);
 		spin_unlock_irqrestore(&slice_convert_lock, flags);
 
 		on_each_cpu(slice_flush_segments, mm, 1);
@@ -504,7 +510,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 
 	/* silence stupid warning */;
 	potential_mask.low_slices = 0;
-	bitmap_zero(potential_mask.high_slices, SLICE_NUM_HIGH);
+	bitmap_zero(potential_mask.high_slices, high_slices);
 
 	/* Sanity checks */
 	BUG_ON(mm->task_size == 0);
@@ -555,13 +561,13 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	if (psize == MMU_PAGE_64K) {
 		compat_maskp = slice_mask_for_size(mm, MMU_PAGE_4K);
 		if (fixed)
-			slice_or_mask(&good_mask, maskp, compat_maskp);
+			slice_or_mask(&good_mask, maskp, compat_maskp, high_slices);
 		else
-			slice_copy_mask(&good_mask, maskp);
+			slice_copy_mask(&good_mask, maskp, high_slices);
 	} else
 #endif
 	{
-		slice_copy_mask(&good_mask, maskp);
+		slice_copy_mask(&good_mask, maskp, high_slices);
 	}
 
 	/* First check hint if it's valid or if we have MAP_FIXED */
@@ -591,8 +597,8 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	 * We don't fit in the good mask, check what other slices are
 	 * empty and thus can be converted
 	 */
-	slice_mask_for_free(mm, &potential_mask, high_limit);
-	slice_or_mask(&potential_mask, &potential_mask, &good_mask);
+	slice_mask_for_free(mm, &potential_mask, high_slices);
+	slice_or_mask(&potential_mask, &potential_mask, &good_mask, high_slices);
 	slice_print_mask(" potential", &potential_mask);
 
 	if (addr || fixed) {
@@ -629,7 +635,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 #ifdef CONFIG_PPC_64K_PAGES
 	if (addr == -ENOMEM && psize == MMU_PAGE_64K) {
 		/* retry the search with 4k-page slices included */
-		slice_or_mask(&potential_mask, &potential_mask, compat_maskp);
+		slice_or_mask(&potential_mask, &potential_mask, compat_maskp, high_slices);
 		addr = slice_find_area(mm, len, &potential_mask,
 				       psize, topdown, high_limit);
 	}
@@ -638,16 +644,16 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	if (addr == -ENOMEM)
 		return -ENOMEM;
 
-	slice_range_to_mask(addr, len, &potential_mask);
+	slice_range_to_mask(addr, len, &potential_mask, high_slices);
 	slice_dbg(" found potential area at 0x%lx\n", addr);
 	slice_print_mask(" mask", maskp);
 
  convert:
-	slice_andnot_mask(&potential_mask, &potential_mask, &good_mask);
+	slice_andnot_mask(&potential_mask, &potential_mask, &good_mask, high_slices);
 	if (compat_maskp && !fixed)
-		slice_andnot_mask(&potential_mask, &potential_mask, compat_maskp);
+		slice_andnot_mask(&potential_mask, &potential_mask, compat_maskp, high_slices);
 	if (potential_mask.low_slices ||
-	    !bitmap_empty(potential_mask.high_slices, SLICE_NUM_HIGH)) {
+	    !bitmap_empty(potential_mask.high_slices, high_slices)) {
 		slice_convert(mm, &potential_mask, psize);
 		if (psize > MMU_PAGE_BASE)
 			on_each_cpu(slice_flush_segments, mm, 1);
@@ -724,6 +730,7 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
 	int index, mask_index;
 	unsigned char *hpsizes;
 	unsigned long flags, lpsizes;
+	unsigned long high_slices;
 	unsigned int old_psize;
 	int i;
 
@@ -749,7 +756,8 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
 	mm->context.low_slices_psize = lpsizes;
 
 	hpsizes = mm->context.high_slices_psize;
-	for (i = 0; i < SLICE_NUM_HIGH; i++) {
+	high_slices = GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit);
+	for (i = 0; i < high_slices; i++) {
 		mask_index = i & 0x1;
 		index = i >> 1;
 		if (((hpsizes[index] >> (mask_index * 4)) & 0xf) == old_psize)
@@ -765,7 +773,7 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
 		  (unsigned long)mm->context.low_slices_psize,
 		  (unsigned long)mm->context.high_slices_psize);
 
-	recalc_slice_mask_cache(mm);
+	recalc_slice_mask_cache(mm, high_slices);
 	spin_unlock_irqrestore(&slice_convert_lock, flags);
 	return;
 bail:
@@ -776,10 +784,12 @@ void slice_set_range_psize(struct mm_struct *mm, unsigned long start,
 			   unsigned long len, unsigned int psize)
 {
 	struct slice_mask mask;
+	unsigned long high_slices;
 
 	VM_BUG_ON(radix_enabled());
 
-	slice_range_to_mask(start, len, &mask);
+	high_slices = GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit);
+	slice_range_to_mask(start, len, &mask, high_slices);
 	slice_convert(mm, &mask, psize);
 }
 
@@ -818,9 +828,11 @@ int is_hugepage_only_range(struct mm_struct *mm, unsigned long addr,
 	if (psize == MMU_PAGE_64K) {
 		const struct slice_mask *compat_maskp;
 		struct slice_mask available;
+		unsigned long high_slices;
 
 		compat_maskp = slice_mask_for_size(mm, MMU_PAGE_4K);
-		slice_or_mask(&available, maskp, compat_maskp);
+		high_slices = GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit);
+		slice_or_mask(&available, maskp, compat_maskp, high_slices);
 		return !slice_check_range_fits(mm, &available, addr, len);
 	}
 #endif
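A sketch of what bounding every bitmap operation by the dynamic high-slice
count buys, as a standalone C program with hypothetical names. With 1TB high
slices, a 512TB address space gives SLICE_NUM_HIGH = 512 bits (eight 64-bit
words), but a process with a 64TB limit only ever needs the first 64 bits.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define DEMO_NUM_HIGH 512   /* fixed maximum, like SLICE_NUM_HIGH */
    #define BITS_PER_WORD 64

    /* Scan only the words that cover 'nbits', the way the kernel bitmap
     * helpers do when given a smaller length. */
    static bool demo_bitmap_empty(const uint64_t *map, unsigned int nbits)
    {
            unsigned int full = nbits / BITS_PER_WORD;
            unsigned int rest = nbits % BITS_PER_WORD;

            for (unsigned int i = 0; i < full; i++)
                    if (map[i])
                            return false;
            if (rest && (map[full] & ((1ULL << rest) - 1)))
                    return false;
            return true;
    }

    int main(void)
    {
            uint64_t high_slices_bitmap[DEMO_NUM_HIGH / BITS_PER_WORD] = { 0 };
            unsigned int high_slices = 64; /* e.g. 64TB limit / 1TB slices */

            /* Only one word is inspected instead of all eight. */
            printf("%d\n", demo_bitmap_empty(high_slices_bitmap, high_slices));
            return 0;
    }

The same bound applies to the copy, or, and andnot helpers in the patch, so
every mask operation in the hot path scales with the address space actually
in use rather than with the architectural maximum.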