From patchwork Fri Sep 8 22:44:43 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ram Pai X-Patchwork-Id: 811859 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xpt0B22WHz9s7v for ; Sat, 9 Sep 2017 08:53:30 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="KaYLBsrR"; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xpt093495zDrSN for ; Sat, 9 Sep 2017 08:53:29 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="KaYLBsrR"; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:400d:c0d::244; helo=mail-qt0-x244.google.com; envelope-from=ram.n.pai@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="KaYLBsrR"; dkim-atps=neutral Received: from mail-qt0-x244.google.com (mail-qt0-x244.google.com [IPv6:2607:f8b0:400d:c0d::244]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xpsrB1tfqzDrcn for ; Sat, 9 Sep 2017 08:46:34 +1000 (AEST) Received: by mail-qt0-x244.google.com with SMTP id p55so2360292qtc.2 for ; Fri, 08 Sep 2017 15:46:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=89SJoLiCMxSYjsbgb585pTf8B2eB4iW44NTJaJzCYuE=; b=KaYLBsrRvog6zs7chz16kXvkNZon/sKNh6Dos9/jE5IFUDPQD4mOfl1fx14fFFcnzu 7amq9ZpJBLWdO83EGUzFBkPZDWSC+KJLq9VvsTPbTN51GdizaLNRbjHzkdPQsEbvMlhT 5ysikKjrwFviaO0EnKmD3DXW/GY6UcIEltvFLuUS0GbSNra7gl4FmE4aKyt5cCz2b0Mc cR+8YfPrn9bJ13gYAL0iRNlD5DmpPUVRs87pzD/l8IG36gNh9xSiOw0CZ1V66HiaNu97 EzeZPibNVvltRBe15nHGLnLDhjds5amZ/QG0XgFrOEJlSnnN9r+lWWK87Y9Un28uEaWS FrQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references; bh=89SJoLiCMxSYjsbgb585pTf8B2eB4iW44NTJaJzCYuE=; b=H63e3QuWzuCZMmOoYRf6v4kZayOLcAUoOSLniIct5DNv7rBglWpN2cEo4kABxZWnFN TPQhMmeqhvrA70GhW70EKE4MxF/xFNt2pah/tZEOXSfgM2lWV0XsQeiMOUsp5O+doJ2e 0XCtWpJEis+mYoaT63sVs++CkTCL4wBXfqNdMjfNPro5em1ACgZuYD/SCOkn7HKo/Xi8 HBJjG0kpQ89g52mwQYOi32jJi4sBgIvHWOCkQ9IaqF5l04l6nydZPPViHTwRI9Oqn9Zm GBQ8MV2deEPCn94tvJWnSOkGAcIHwTU2IgVBxnab7JxLpIlSolCYmWGo6BlIYFtYtB25 qB8Q== X-Gm-Message-State: AHPjjUiMn0GnuKRbR31uGn+jpka66mrFP56IPwKwIwyXqwkCRFXSKXYT oDp0Y22QAzqX5A== X-Google-Smtp-Source: AOwi7QC1tNfPwHYvFSqhKuyfgkLERJbMRA0d357gRVvoX/K1bpuAbcVvlI2/jm6lrDZZ2fyydvgq7A== X-Received: by 10.200.24.146 with SMTP id s18mr6107483qtj.38.1504910792083; Fri, 08 Sep 2017 15:46:32 -0700 (PDT) Received: from localhost.localdomain (50-39-103-96.bvtn.or.frontiernet.net. [50.39.103.96]) by smtp.gmail.com with ESMTPSA id x124sm2033726qka.85.2017.09.08.15.46.30 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 08 Sep 2017 15:46:31 -0700 (PDT) From: Ram Pai To: mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org Subject: [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages Date: Fri, 8 Sep 2017 15:44:43 -0700 Message-Id: <1504910713-7094-4-git-send-email-linuxram@us.ibm.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1504910713-7094-1-git-send-email-linuxram@us.ibm.com> References: <1504910713-7094-1-git-send-email-linuxram@us.ibm.com> X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ebiederm@xmission.com, linuxram@us.ibm.com, mhocko@kernel.org, paulus@samba.org, aneesh.kumar@linux.vnet.ibm.com, bauerman@linux.vnet.ibm.com, khandual@linux.vnet.ibm.com Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6, in the 4K backed HPTE pages.These bits continue to be used for 64K backed HPTE pages in this patch, but will be freed up in the next patch. The bit numbers are big-endian as defined in the ISA3.0 The patch does the following change to the 4k htpe backed 64K PTE's format. H_PAGE_BUSY moves from bit 3 to bit 9 (B bit in the figure below) V0 which occupied bit 4 is not used anymore. V1 which occupied bit 5 is not used anymore. V2 which occupied bit 6 is not used anymore. V3 which occupied bit 7 is not used anymore. Before the patch, the 4k backed 64k PTE format was as follows 0 1 2 3 4 5 6 7 8 9 10...........................63 : : : : : : : : : : : : v v v v v v v v v v v v ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-, |x|x|x|B|V0|V1|V2|V3|x| | |x|x|................|x|x|x|x| <- primary pte '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_' |S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_' After the patch, the 4k backed 64k PTE format is as follows 0 1 2 3 4 5 6 7 8 9 10...........................63 : : : : : : : : : : : : v v v v v v v v v v v v ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-, |x|x|x| | | | | |x|B| |x|x|................|.|.|.|.| <- primary pte '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_' |S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_' the four bits S,G,I,X (one quadruplet per 4k HPTE) that cache the hash-bucket slot value, is initialized to 1,1,1,1 indicating -- an invalid slot. If a HPTE gets cached in a 1111 slot(i.e 7th slot of secondary hash bucket), it is released immediately. In other words, even though 1111 is a valid slot value in the hash bucket, we consider it invalid and release the slot and the HPTE. This gives us the opportunity to determine the validity of S,G,I,X bits based on its contents and not on any of the bits V0,V1,V2 or V3 in the primary PTE When we release a HPTE cached in the 1111 slot we also release a legitimate slot in the primary hash bucket and unmap its corresponding HPTE. This is to ensure that we do get a HPTE cached in a slot of the primary hash bucket, the next time we retry. Though treating 1111 slot as invalid, reduces the number of available slots in the hash bucket and may have an effect on the performance, the probabilty of hitting a 1111 slot is extermely low. Compared to the current scheme, the above scheme reduces the number of false hash table updates significantly and has the added advantage of releasing four valuable PTE bits for other purpose. NOTE:even though bits 3, 4, 5, 6, 7 are not used when the 64K PTE is backed by 4k HPTE, they continue to be used if the PTE gets backed by 64k HPTE. The next patch will decouple that aswell, and truely release the bits. This idea was jointly developed by Paul Mackerras, Aneesh, Michael Ellermen and myself. 4K PTE format remains unchanged currently. The patch does the following code changes a) PTE flags are split between 64k and 4k header files. b) __hash_page_4K() is reimplemented to reflect the above logic. Reviewed-by: Aneesh Kumar K.V Signed-off-by: Ram Pai Acked-by: Balbir Singh --- arch/powerpc/include/asm/book3s/64/hash-4k.h | 2 + arch/powerpc/include/asm/book3s/64/hash-64k.h | 8 +-- arch/powerpc/include/asm/book3s/64/hash.h | 1 - arch/powerpc/mm/hash64_64k.c | 106 +++++++++++++------------ arch/powerpc/mm/hash_utils_64.c | 4 +- 5 files changed, 63 insertions(+), 58 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h index 8909039..e66bfeb 100644 --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h @@ -16,6 +16,8 @@ #define H_PUD_TABLE_SIZE (sizeof(pud_t) << H_PUD_INDEX_SIZE) #define H_PGD_TABLE_SIZE (sizeof(pgd_t) << H_PGD_INDEX_SIZE) +#define H_PAGE_BUSY _RPAGE_RSV1 /* software: PTE & hash are busy */ + /* PTE flags to conserve for HPTE identification */ #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | \ H_PAGE_F_SECOND | H_PAGE_F_GIX) diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h index 6652669..e038f1c 100644 --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h @@ -12,18 +12,14 @@ */ #define H_PAGE_COMBO _RPAGE_RPN0 /* this is a combo 4k page */ #define H_PAGE_4K_PFN _RPAGE_RPN1 /* PFN is for a single 4k page */ +#define H_PAGE_BUSY _RPAGE_RPN42 /* software: PTE & hash are busy */ + /* * We need to differentiate between explicit huge page and THP huge * page, since THP huge page also need to track real subpage details */ #define H_PAGE_THP_HUGE H_PAGE_4K_PFN -/* - * Used to track subpage group valid if H_PAGE_COMBO is set - * This overloads H_PAGE_F_GIX and H_PAGE_F_SECOND - */ -#define H_PAGE_COMBO_VALID (H_PAGE_F_GIX | H_PAGE_F_SECOND) - /* PTE flags to conserve for HPTE identification */ #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_F_SECOND | \ H_PAGE_F_GIX | H_PAGE_HASHPTE | H_PAGE_COMBO) diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h index 060c059..8ce4112 100644 --- a/arch/powerpc/include/asm/book3s/64/hash.h +++ b/arch/powerpc/include/asm/book3s/64/hash.h @@ -9,7 +9,6 @@ */ #define H_PTE_NONE_MASK _PAGE_HPTEFLAGS #define H_PAGE_F_GIX_SHIFT 56 -#define H_PAGE_BUSY _RPAGE_RSV1 /* software: PTE & hash are busy */ #define H_PAGE_F_SECOND _RPAGE_RSV2 /* HPTE is in 2ndary HPTEG */ #define H_PAGE_F_GIX (_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44) #define H_PAGE_HASHPTE _RPAGE_RPN43 /* PTE has associated HPTE */ diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c index 1a68cb1..c6c5559 100644 --- a/arch/powerpc/mm/hash64_64k.c +++ b/arch/powerpc/mm/hash64_64k.c @@ -15,34 +15,22 @@ #include #include #include + /* - * index from 0 - 15 + * return true, if the entry has a slot value which + * the software considers as invalid. */ -bool __rpte_sub_valid(real_pte_t rpte, unsigned long index) +static inline bool hpte_soft_invalid(unsigned long slot) { - unsigned long g_idx; - unsigned long ptev = pte_val(rpte.pte); - - g_idx = (ptev & H_PAGE_COMBO_VALID) >> H_PAGE_F_GIX_SHIFT; - index = index >> 2; - if (g_idx & (0x1 << index)) - return true; - else - return false; + return ((slot & 0xfUL) == 0xfUL); } + /* * index from 0 - 15 */ -static unsigned long mark_subptegroup_valid(unsigned long ptev, unsigned long index) +bool __rpte_sub_valid(real_pte_t rpte, unsigned long index) { - unsigned long g_idx; - - if (!(ptev & H_PAGE_COMBO)) - return ptev; - index = index >> 2; - g_idx = 0x1 << index; - - return ptev | (g_idx << H_PAGE_F_GIX_SHIFT); + return !(hpte_soft_invalid(rpte.hidx >> (index << 2))); } int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid, @@ -50,12 +38,11 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid, int ssize, int subpg_prot) { real_pte_t rpte; - unsigned long *hidxp; unsigned long hpte_group; unsigned int subpg_index; - unsigned long rflags, pa, hidx; + unsigned long rflags, pa; unsigned long old_pte, new_pte, subpg_pte; - unsigned long vpn, hash, slot; + unsigned long vpn, hash, slot, gslot; unsigned long shift = mmu_psize_defs[MMU_PAGE_4K].shift; /* @@ -126,18 +113,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid, if (__rpte_sub_valid(rpte, subpg_index)) { int ret; - hash = hpt_hash(vpn, shift, ssize); - hidx = __rpte_to_hidx(rpte, subpg_index); - if (hidx & _PTEIDX_SECONDARY) - hash = ~hash; - slot = (hash & htab_hash_mask) * HPTES_PER_GROUP; - slot += hidx & _PTEIDX_GROUP_IX; + gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, + subpg_index); + ret = mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, + MMU_PAGE_4K, MMU_PAGE_4K, ssize, flags); - ret = mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, - MMU_PAGE_4K, MMU_PAGE_4K, - ssize, flags); /* - *if we failed because typically the HPTE wasn't really here + * if we failed because typically the HPTE wasn't really here * we try an insertion. */ if (ret == -1) @@ -148,6 +130,15 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid, } htab_insert_hpte: + + /* + * initialize all hidx entries to invalid value, + * the first time the PTE is about to allocate + * a 4K hpte + */ + if (!(old_pte & H_PAGE_COMBO)) + rpte.hidx = ~0x0UL; + /* * handle H_PAGE_4K_PFN case */ @@ -172,15 +163,41 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid, * Primary is full, try the secondary */ if (unlikely(slot == -1)) { + bool soft_invalid; + hpte_group = ((~hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL; slot = mmu_hash_ops.hpte_insert(hpte_group, vpn, pa, rflags, HPTE_V_SECONDARY, MMU_PAGE_4K, MMU_PAGE_4K, ssize); - if (slot == -1) { - if (mftb() & 0x1) + + soft_invalid = hpte_soft_invalid(slot); + if (unlikely(soft_invalid)) { + /* + * we got a valid slot from a hardware point of view. + * but we cannot use it, because we use this special + * value; as defined by hpte_soft_invalid(), + * to track invalid slots. We cannot use it. + * So invalidate it. + */ + gslot = slot & _PTEIDX_GROUP_IX; + mmu_hash_ops.hpte_invalidate(hpte_group+gslot, vpn, + MMU_PAGE_4K, MMU_PAGE_4K, + ssize, 0); + } + + if (unlikely(slot == -1 || soft_invalid)) { + /* + * for soft invalid slot, lets ensure that we + * release a slot from the primary, with the + * hope that we will acquire that slot next + * time we try. This will ensure that we do not + * get the same soft-invalid slot. + */ + if (soft_invalid || (mftb() & 0x1)) hpte_group = ((hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL; + mmu_hash_ops.hpte_remove(hpte_group); /* * FIXME!! Should be try the group from which we removed ? @@ -198,21 +215,10 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid, MMU_PAGE_4K, MMU_PAGE_4K, old_pte); return -1; } - /* - * Insert slot number & secondary bit in PTE second half, - * clear H_PAGE_BUSY and set appropriate HPTE slot bit - * Since we have H_PAGE_BUSY set on ptep, we can be sure - * nobody is undating hidx. - */ - hidxp = (unsigned long *)(ptep + PTRS_PER_PTE); - rpte.hidx &= ~(0xfUL << (subpg_index << 2)); - *hidxp = rpte.hidx | (slot << (subpg_index << 2)); - new_pte = mark_subptegroup_valid(new_pte, subpg_index); - new_pte |= H_PAGE_HASHPTE; - /* - * check __real_pte for details on matching smp_rmb() - */ - smp_wmb(); + + new_pte |= pte_set_hash_slot(ptep, rpte, subpg_index, slot); + new_pte |= H_PAGE_HASHPTE; + *ptep = __pte(new_pte & ~H_PAGE_BUSY); return 0; } diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c index e68f053..a40c7bc 100644 --- a/arch/powerpc/mm/hash_utils_64.c +++ b/arch/powerpc/mm/hash_utils_64.c @@ -978,8 +978,9 @@ void __init hash__early_init_devtree(void) void __init hash__early_init_mmu(void) { +#ifndef CONFIG_PPC_64K_PAGES /* - * We have code in __hash_page_64K() and elsewhere, which assumes it can + * We have code in __hash_page_4K() and elsewhere, which assumes it can * do the following: * new_pte |= (slot << H_PAGE_F_GIX_SHIFT) & (H_PAGE_F_SECOND | H_PAGE_F_GIX); * @@ -990,6 +991,7 @@ void __init hash__early_init_mmu(void) * with a BUILD_BUG_ON(). */ BUILD_BUG_ON(H_PAGE_F_SECOND != (1ul << (H_PAGE_F_GIX_SHIFT + 3))); +#endif /* CONFIG_PPC_64K_PAGES */ htab_init_page_sizes();