[{"id":1768277,"web_url":"http://patchwork.ozlabs.org/comment/1768277/","msgid":"<20170914114449.40446d96@firefly.ozlabs.ibm.com>","date":"2017-09-14T01:44:49","subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","submitter":{"id":9347,"url":"http://patchwork.ozlabs.org/api/people/9347/","name":"Balbir Singh","email":"bsingharora@gmail.com"},"content":"On Fri,  8 Sep 2017 15:44:44 -0700\nRam Pai <linuxram@us.ibm.com> wrote:\n\n> Rearrange 64K PTE bits to  free  up  bits 3, 4, 5  and  6\n> in the 64K backed HPTE pages. This along with the earlier\n> patch will  entirely free  up the four bits from 64K PTE.\n> The bit numbers are  big-endian as defined in the  ISA3.0\n> \n> This patch  does  the  following change to 64K PTE backed\n> by 64K HPTE.\n> \n> H_PAGE_F_SECOND (S) which  occupied  bit  4  moves to the\n> \tsecond part of the pte to bit 60.\n> H_PAGE_F_GIX (G,I,X) which  occupied  bit 5, 6 and 7 also\n> \tmoves  to  the   second part of the pte to bit 61,\n>        \t62, 63, 64 respectively\n> \n> since bit 7 is now freed up, we move H_PAGE_BUSY (B) from\n> bit  9  to  bit  7.\n> \n> The second part of the PTE will hold\n> (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.\n> NOTE: None of the bits in the secondary PTE were not used\n> by 64k-HPTE backed PTE.\n> \n> Before the patch, the 64K HPTE backed 64k PTE format was\n> as follows\n> \n>  0 1 2 3 4  5  6  7  8 9 10...........................63\n>  : : : : :  :  :  :  : : :                            :\n>  v v v v v  v  v  v  v v v                            v\n> \n> ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,\n> |x|x|x| |S |G |I |X |x|B| |x|x|................|x|x|x|x| <- primary pte\n> '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'\n> | | | | |  |  |  |  | | | | |..................| | | | | <- secondary pte\n> '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'\n> \n> After the patch, the 64k HPTE backed 64k PTE format is\n> as follows\n> 
\n>  0 1 2 3 4  5  6  7  8 9 10...........................63\n>  : : : : :  :  :  :  : : :                            :\n>  v v v v v  v  v  v  v v v                            v\n> \n> ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,\n> |x|x|x| |  |  |  |B |x| | |x|x|................|.|.|.|.| <- primary pte\n> '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'\n> | | | | |  |  |  |  | | | | |..................|S|G|I|X| <- secondary pte\n> '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'\n> \n> The above PTE changes is applicable to hugetlbpages aswell.\n> \n> The patch does the following code changes:\n> \n> a) moves  the  H_PAGE_F_SECOND and  H_PAGE_F_GIX to 4k PTE\n> \theader   since it is no more needed b the 64k PTEs.\n> b) abstracts  out __real_pte() and __rpte_to_hidx() so the\n> \tcaller  need not know the bit location of the slot.\n> c) moves the slot bits to the secondary pte.\n> \n> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>\n> Signed-off-by: Ram Pai <linuxram@us.ibm.com>\n> ---\n>  arch/powerpc/include/asm/book3s/64/hash-4k.h  |    3 ++\n>  arch/powerpc/include/asm/book3s/64/hash-64k.h |   29 +++++++++++-------------\n>  arch/powerpc/include/asm/book3s/64/hash.h     |    3 --\n>  arch/powerpc/mm/hash64_64k.c                  |   23 ++++++++-----------\n>  arch/powerpc/mm/hugetlbpage-hash64.c          |   18 ++++++---------\n>  5 files changed, 33 insertions(+), 43 deletions(-)\n> \n> diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h\n> index e66bfeb..dc153c6 100644\n> --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h\n> +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h\n> @@ -16,6 +16,9 @@\n>  #define H_PUD_TABLE_SIZE\t(sizeof(pud_t) << H_PUD_INDEX_SIZE)\n>  #define H_PGD_TABLE_SIZE\t(sizeof(pgd_t) << H_PGD_INDEX_SIZE)\n>  \n> +#define H_PAGE_F_GIX_SHIFT\t56\n> +#define H_PAGE_F_SECOND\t_RPAGE_RSV2\t/* HPTE is in 2ndary HPTEG */\n> +#define 
H_PAGE_F_GIX\t(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)\n>  #define H_PAGE_BUSY\t_RPAGE_RSV1     /* software: PTE & hash are busy */\n>  \n>  /* PTE flags to conserve for HPTE identification */\n> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h\n> index e038f1c..89ef5a9 100644\n> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h\n> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h\n> @@ -12,7 +12,7 @@\n>   */\n>  #define H_PAGE_COMBO\t_RPAGE_RPN0 /* this is a combo 4k page */\n>  #define H_PAGE_4K_PFN\t_RPAGE_RPN1 /* PFN is for a single 4k page */\n> -#define H_PAGE_BUSY\t_RPAGE_RPN42     /* software: PTE & hash are busy */\n> +#define H_PAGE_BUSY\t_RPAGE_RPN44     /* software: PTE & hash are busy */\n>  \n>  /*\n>   * We need to differentiate between explicit huge page and THP huge\n> @@ -21,8 +21,7 @@\n>  #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN\n>  \n>  /* PTE flags to conserve for HPTE identification */\n> -#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_F_SECOND | \\\n> -\t\t\t H_PAGE_F_GIX | H_PAGE_HASHPTE | H_PAGE_COMBO)\n> +#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | H_PAGE_COMBO)\n>  /*\n>   * we support 16 fragments per PTE page of 64K size.\n>   */\n> @@ -50,24 +49,22 @@ static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)\n>  \tunsigned long *hidxp;\n>  \n>  \trpte.pte = pte;\n> -\trpte.hidx = 0;\n> -\tif (pte_val(pte) & H_PAGE_COMBO) {\n> -\t\t/*\n> -\t\t * Make sure we order the hidx load against the H_PAGE_COMBO\n> -\t\t * check. The store side ordering is done in __hash_page_4K\n> -\t\t */\n> -\t\tsmp_rmb();\n> -\t\thidxp = (unsigned long *)(ptep + PTRS_PER_PTE);\n> -\t\trpte.hidx = *hidxp;\n> -\t}\n> +\t/*\n> +\t * Ensure that we do not read the hidx before we read\n> +\t * the pte. 
Because the writer side is  expected\n> +\t * to finish writing the hidx first followed by the pte,\n> +\t * by using smp_wmb().\n> +\t * pte_set_hash_slot() ensures that.\n> +\t */\n> +\tsmp_rmb();\n> +\thidxp = (unsigned long *)(ptep + PTRS_PER_PTE);\n> +\trpte.hidx = *hidxp;\n>  \treturn rpte;\n>  }\n>  \n>  static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)\n>  {\n> -\tif ((pte_val(rpte.pte) & H_PAGE_COMBO))\n> -\t\treturn (rpte.hidx >> (index<<2)) & 0xf;\n> -\treturn (pte_val(rpte.pte) >> H_PAGE_F_GIX_SHIFT) & 0xf;\n> +\treturn ((rpte.hidx >> (index<<2)) & 0xfUL);\n>  }\n>  \n>  /*\n> diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h\n> index 8ce4112..46f3a23 100644\n> --- a/arch/powerpc/include/asm/book3s/64/hash.h\n> +++ b/arch/powerpc/include/asm/book3s/64/hash.h\n> @@ -8,9 +8,6 @@\n>   *\n>   */\n>  #define H_PTE_NONE_MASK\t\t_PAGE_HPTEFLAGS\n> -#define H_PAGE_F_GIX_SHIFT\t56\n> -#define H_PAGE_F_SECOND\t\t_RPAGE_RSV2\t/* HPTE is in 2ndary HPTEG */\n> -#define H_PAGE_F_GIX\t\t(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)\n>  #define H_PAGE_HASHPTE\t\t_RPAGE_RPN43\t/* PTE has associated HPTE */\n>  \n>  #ifdef CONFIG_PPC_64K_PAGES\n> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c\n> index c6c5559..9c63844 100644\n> --- a/arch/powerpc/mm/hash64_64k.c\n> +++ b/arch/powerpc/mm/hash64_64k.c\n> @@ -103,8 +103,8 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,\n>  \t\t * On hash insert failure we use old pte value and we don't\n>  \t\t * want slot information there if we have a insert failure.\n>  \t\t */\n> -\t\told_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);\n> -\t\tnew_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);\n> +\t\told_pte &= ~H_PAGE_HASHPTE;\n> +\t\tnew_pte &= ~H_PAGE_HASHPTE;\n\nShouldn't we set old/new_pte.slot = invalid? 
via rpte.hidx\n\n>  \t\tgoto htab_insert_hpte;\n>  \t}\n>  \t/*\n> @@ -227,6 +227,7 @@ int __hash_page_64K(unsigned long ea, unsigned long access,\n>  \t\t    unsigned long vsid, pte_t *ptep, unsigned long trap,\n>  \t\t    unsigned long flags, int ssize)\n>  {\n> +\treal_pte_t rpte;\n>  \tunsigned long hpte_group;\n>  \tunsigned long rflags, pa;\n>  \tunsigned long old_pte, new_pte;\n> @@ -263,6 +264,7 @@ int __hash_page_64K(unsigned long ea, unsigned long access,\n>  \t} while (!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));\n>  \n>  \trflags = htab_convert_pte_flags(new_pte);\n> +\trpte = __real_pte(__pte(old_pte), ptep);\n>  \n>  \tif (cpu_has_feature(CPU_FTR_NOEXECUTE) &&\n>  \t    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))\n> @@ -270,18 +272,13 @@ int __hash_page_64K(unsigned long ea, unsigned long access,\n>  \n>  \tvpn  = hpt_vpn(ea, vsid, ssize);\n>  \tif (unlikely(old_pte & H_PAGE_HASHPTE)) {\n> +\t\tunsigned long gslot;\n>  \t\t/*\n>  \t\t * There MIGHT be an HPTE for this pte\n>  \t\t */\n> -\t\thash = hpt_hash(vpn, shift, ssize);\n> -\t\tif (old_pte & H_PAGE_F_SECOND)\n> -\t\t\thash = ~hash;\n> -\t\tslot = (hash & htab_hash_mask) * HPTES_PER_GROUP;\n> -\t\tslot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;\n> -\n> -\t\tif (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_64K,\n> -\t\t\t\t\t       MMU_PAGE_64K, ssize,\n> -\t\t\t\t\t       flags) == -1)\n> +\t\tgslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);\n> +\t\tif (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, MMU_PAGE_64K,\n> +\t\t\t\tMMU_PAGE_64K, ssize, flags) == -1)\n>  \t\t\told_pte &= ~_PAGE_HPTEFLAGS;\n>  \t}\n>  \n> @@ -328,9 +325,9 @@ int __hash_page_64K(unsigned long ea, unsigned long access,\n>  \t\t\t\t\t   MMU_PAGE_64K, MMU_PAGE_64K, old_pte);\n>  \t\t\treturn -1;\n>  \t\t}\n> +\n>  \t\tnew_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;\n> -\t\tnew_pte |= (slot << H_PAGE_F_GIX_SHIFT) &\n> -\t\t\t(H_PAGE_F_SECOND | H_PAGE_F_GIX);\n> +\t\tnew_pte |= 
pte_set_hash_slot(ptep, rpte, 0, slot);\n>  \t}\n>  \t*ptep = __pte(new_pte & ~H_PAGE_BUSY);\n>  \treturn 0;\n> diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c\n> index a84bb44..d52d667 100644\n> --- a/arch/powerpc/mm/hugetlbpage-hash64.c\n> +++ b/arch/powerpc/mm/hugetlbpage-hash64.c\n> @@ -22,6 +22,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,\n>  \t\t     pte_t *ptep, unsigned long trap, unsigned long flags,\n>  \t\t     int ssize, unsigned int shift, unsigned int mmu_psize)\n>  {\n> +\treal_pte_t rpte;\n>  \tunsigned long vpn;\n>  \tunsigned long old_pte, new_pte;\n>  \tunsigned long rflags, pa, sz;\n> @@ -61,6 +62,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,\n>  \t} while(!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));\n>  \n>  \trflags = htab_convert_pte_flags(new_pte);\n> +\trpte = __real_pte(__pte(old_pte), ptep);\n>  \n>  \tsz = ((1UL) << shift);\n>  \tif (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))\n> @@ -71,16 +73,11 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,\n>  \t/* Check if pte already has an hpte (case 2) */\n>  \tif (unlikely(old_pte & H_PAGE_HASHPTE)) {\n>  \t\t/* There MIGHT be an HPTE for this pte */\n> -\t\tunsigned long hash, slot;\n> +\t\tunsigned long gslot;\n>  \n> -\t\thash = hpt_hash(vpn, shift, ssize);\n> -\t\tif (old_pte & H_PAGE_F_SECOND)\n> -\t\t\thash = ~hash;\n> -\t\tslot = (hash & htab_hash_mask) * HPTES_PER_GROUP;\n> -\t\tslot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;\n> -\n> -\t\tif (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, mmu_psize,\n> -\t\t\t\t\t       mmu_psize, ssize, flags) == -1)\n> +\t\tgslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);\n> +\t\tif (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, mmu_psize,\n> +\t\t\t\tmmu_psize, ssize, flags) == -1)\n>  \t\t\told_pte &= ~_PAGE_HPTEFLAGS;\n>  \t}\n>  \n> @@ -106,8 +103,7 @@ int 
__hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,\n>  \t\t\treturn -1;\n>  \t\t}\n>  \n> -\t\tnew_pte |= (slot << H_PAGE_F_GIX_SHIFT) &\n> -\t\t\t(H_PAGE_F_SECOND | H_PAGE_F_GIX);\n> +\t\tnew_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);\n>  \t}\n>  \n>  \t/*\n\nBalbir","headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xt1bm0Jbcz9t2l\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu, 14 Sep 2017 11:46:44 +1000 (AEST)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3xt1bl6G24zDqlR\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu, 14 Sep 2017 11:46:43 +1000 (AEST)","from mail-pg0-x242.google.com (mail-pg0-x242.google.com\n\t[IPv6:2607:f8b0:400e:c05::242])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3xt1Yn5TxyzDqj8\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tThu, 14 Sep 2017 11:45:01 +1000 (AEST)","by mail-pg0-x242.google.com with SMTP id m30so894970pgn.5\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tWed, 13 Sep 2017 18:45:01 -0700 (PDT)","from firefly.ozlabs.ibm.com ([122.99.82.10])\n\tby smtp.gmail.com with ESMTPSA id\n\tg16sm26859601pfd.6.2017.09.13.18.44.55\n\t(version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);\n\tWed, 13 Sep 2017 18:44:59 -0700 (PDT)"],"Authentication-Results":["ozlabs.org;\n\tdkim=fail reason=\"signature verification failed\" (2048-bit key;\n\tunprotected) header.d=gmail.com 
header.i=@gmail.com\n\theader.b=\"eA7a1lI2\"; dkim-atps=neutral","lists.ozlabs.org;\n\tdkim=fail reason=\"signature verification failed\" (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"eA7a1lI2\"; dkim-atps=neutral","ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=gmail.com\n\t(client-ip=2607:f8b0:400e:c05::242; helo=mail-pg0-x242.google.com;\n\tenvelope-from=bsingharora@gmail.com; receiver=<UNKNOWN>)","lists.ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"eA7a1lI2\"; dkim-atps=neutral"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;\n\th=date:from:to:cc:subject:message-id:in-reply-to:references\n\t:mime-version:content-transfer-encoding;\n\tbh=Cll92B+lGf52KT7wm3uEMbFFw8tWfL002z5HEYqCzy8=;\n\tb=eA7a1lI2LOw300RlbEoAeoiRfJJgPUfR4dSDMLcWHumlh0pEDY09ezBe2DcZIAAg+p\n\t4qcU8yxg+38VToptBffQzjq7zoIClhK30utwMcMwNWga2QZChXWzNQIye7QZ31y01wWd\n\thgLggk1ljx6Wg/MYoLHx7P2e/K2l/TjIU2SSn3er6VLK+GPDaxOnUN/bwxA3ulf/OWWg\n\tByikT6X+kPyYQ5BbhreWY0eGiFBoDRp5eHxeQ1IxVqmO5/bt6LlJBRt9CZ3l5rIlYlFQ\n\t0621BxMX4NQmycs+2/RJQ97r6iibuRhWvraM2OxmDYAQeJ9WfGwZYN5OIYXg8a2otxQw\n\t6IOw==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; 
s=20161025;\n\th=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to\n\t:references:mime-version:content-transfer-encoding;\n\tbh=Cll92B+lGf52KT7wm3uEMbFFw8tWfL002z5HEYqCzy8=;\n\tb=JH9uO5/YYckBrFYCLJCMEROtELUCmVaYLpnJTt2Pk6vI+D+RXBDp5H0igQuZMyPmfx\n\txfOMnNXYn4l5nfVVeXuXYEabb9NJGDZCSm4hj8DO2mCNb7rst0+fr4CfIIkf30LPfqYP\n\trjnDJeKJNpTKycTRCt5ncPZVXg+rXeSGVoGwu0t5eVD/Z8b/EQzfU8B3VjC6lO44YMrA\n\tzrTwnA81mLSi6C/p080TNNEZQUau8sJR2pBfOJFXIBb0sZh3AQNd8aZ8eVcYVss+ndzK\n\tlvNNiR+uEy96U8Q4/HddzMh+j5scgdd8eMNKrWRrMQD1I+A4mvj4vl5QWNeFwa92zwBN\n\t14Tg==","X-Gm-Message-State":"AHPjjUgpbighFWoN5RMySa0cisI41+3yADLYzDS47EH7rZHY5Jx9sUfd\n\tvQpeq+1jDxqICg==","X-Google-Smtp-Source":"ADKCNb6Lxtk++U+FD7k0Yf7F638iQJvAbswOlq1cvt1nbfM48+sCdyxNZWWDccgeEkVU6BO4eOheKQ==","X-Received":"by 10.84.132.99 with SMTP id 90mr22555110ple.406.1505353499337; \n\tWed, 13 Sep 2017 18:44:59 -0700 (PDT)","Date":"Thu, 14 Sep 2017 11:44:49 +1000","From":"Balbir Singh <bsingharora@gmail.com>","To":"Ram Pai <linuxram@us.ibm.com>","Subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","Message-ID":"<20170914114449.40446d96@firefly.ozlabs.ibm.com>","In-Reply-To":"<1504910713-7094-5-git-send-email-linuxram@us.ibm.com>","References":"<1504910713-7094-1-git-send-email-linuxram@us.ibm.com>\n\t<1504910713-7094-5-git-send-email-linuxram@us.ibm.com>","X-Mailer":"Claws Mail 3.14.1 (GTK+ 2.24.31; x86_64-redhat-linux-gnu)","MIME-Version":"1.0","Content-Type":"text/plain; charset=US-ASCII","Content-Transfer-Encoding":"7bit","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.24","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail 
List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Cc":"ebiederm@xmission.com, mhocko@kernel.org, paulus@samba.org,\n\taneesh.kumar@linux.vnet.ibm.com, bauerman@linux.vnet.ibm.com,\n\tlinuxppc-dev@lists.ozlabs.org, khandual@linux.vnet.ibm.com","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"}},{"id":1768411,"web_url":"http://patchwork.ozlabs.org/comment/1768411/","msgid":"<1505376837.12628.192.camel@kernel.crashing.org>","date":"2017-09-14T08:13:57","subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","submitter":{"id":38,"url":"http://patchwork.ozlabs.org/api/people/38/","name":"Benjamin Herrenschmidt","email":"benh@kernel.crashing.org"},"content":"On Fri, 2017-09-08 at 15:44 -0700, Ram Pai wrote:\n> The second part of the PTE will hold\n> (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.\n> NOTE: None of the bits in the secondary PTE were not used\n> by 64k-HPTE backed PTE.\n\nHave you measured the performance impact of this ? 
The second part of\nthe PTE being in a different cache line there could be one...\n\nCheers,\nBen.","headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xtBGf57ysz9s7v\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu, 14 Sep 2017 18:17:30 +1000 (AEST)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3xtBGf3rJ8zDqYd\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu, 14 Sep 2017 18:17:30 +1000 (AEST)","from gate.crashing.org (gate.crashing.org [63.228.1.57])\n\t(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3xtBC03ndszDrSW\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tThu, 14 Sep 2017 18:14:20 +1000 (AEST)","from localhost (localhost.localdomain [127.0.0.1])\n\tby gate.crashing.org (8.14.1/8.13.8) with ESMTP id v8E8Dv2L022857;\n\tThu, 14 Sep 2017 03:13:58 -0500"],"Authentication-Results":"ozlabs.org; spf=permerror (mailfrom)\n\tsmtp.mailfrom=kernel.crashing.org (client-ip=63.228.1.57;\n\thelo=gate.crashing.org; envelope-from=benh@kernel.crashing.org;\n\treceiver=<UNKNOWN>)","Message-ID":"<1505376837.12628.192.camel@kernel.crashing.org>","Subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","From":"Benjamin Herrenschmidt <benh@kernel.crashing.org>","To":"Ram Pai <linuxram@us.ibm.com>, mpe@ellerman.id.au,\n\tlinuxppc-dev@lists.ozlabs.org","Date":"Thu, 14 Sep 2017 18:13:57 
+1000","In-Reply-To":"<1504910713-7094-5-git-send-email-linuxram@us.ibm.com>","References":"<1504910713-7094-1-git-send-email-linuxram@us.ibm.com>\n\t<1504910713-7094-5-git-send-email-linuxram@us.ibm.com>","Content-Type":"text/plain; charset=\"UTF-8\"","X-Mailer":"Evolution 3.24.5 (3.24.5-1.fc26) ","Mime-Version":"1.0","Content-Transfer-Encoding":"7bit","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.24","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Cc":"ebiederm@xmission.com, mhocko@kernel.org, paulus@samba.org,\n\taneesh.kumar@linux.vnet.ibm.com, bauerman@linux.vnet.ibm.com,\n\tkhandual@linux.vnet.ibm.com","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"}},{"id":1768760,"web_url":"http://patchwork.ozlabs.org/comment/1768760/","msgid":"<20170914175408.GF5698@ram.oc3035372033.ibm.com>","date":"2017-09-14T17:54:09","subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","submitter":{"id":2667,"url":"http://patchwork.ozlabs.org/api/people/2667/","name":"Ram Pai","email":"linuxram@us.ibm.com"},"content":"On Thu, Sep 14, 2017 at 11:44:49AM +1000, Balbir Singh wrote:\n> On Fri,  8 Sep 2017 15:44:44 -0700\n> Ram Pai <linuxram@us.ibm.com> wrote:\n> \n> > Rearrange 64K PTE bits to  free  up  bits 3, 4, 5  and  6\n> > in the 64K backed HPTE pages. 
This along with the earlier\n> > patch will  entirely free  up the four bits from 64K PTE.\n> > The bit numbers are  big-endian as defined in the  ISA3.0\n> > \n> > This patch  does  the  following change to 64K PTE backed\n> > by 64K HPTE.\n> > \n> > H_PAGE_F_SECOND (S) which  occupied  bit  4  moves to the\n> > \tsecond part of the pte to bit 60.\n> > H_PAGE_F_GIX (G,I,X) which  occupied  bit 5, 6 and 7 also\n> > \tmoves  to  the   second part of the pte to bit 61,\n> >        \t62, 63, 64 respectively\n> > \n> > since bit 7 is now freed up, we move H_PAGE_BUSY (B) from\n> > bit  9  to  bit  7.\n> > \n> > The second part of the PTE will hold\n> > (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.\n> > NOTE: None of the bits in the secondary PTE were not used\n> > by 64k-HPTE backed PTE.\n> > \n> > Before the patch, the 64K HPTE backed 64k PTE format was\n> > as follows\n> > \n> >  0 1 2 3 4  5  6  7  8 9 10...........................63\n> >  : : : : :  :  :  :  : : :                            :\n> >  v v v v v  v  v  v  v v v                            v\n> > \n> > ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,\n> > |x|x|x| |S |G |I |X |x|B| |x|x|................|x|x|x|x| <- primary pte\n> > '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'\n> > | | | | |  |  |  |  | | | | |..................| | | | | <- secondary pte\n> > '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'\n> > \n> > After the patch, the 64k HPTE backed 64k PTE format is\n> > as follows\n> > \n> >  0 1 2 3 4  5  6  7  8 9 10...........................63\n> >  : : : : :  :  :  :  : : :                            :\n> >  v v v v v  v  v  v  v v v                            v\n> > \n> > ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,\n> > |x|x|x| |  |  |  |B |x| | |x|x|................|.|.|.|.| <- primary pte\n> > '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'\n> > | | | | |  |  |  |  | | | | |..................|S|G|I|X| <- secondary pte\n> > 
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'\n> > \n> > The above PTE changes is applicable to hugetlbpages aswell.\n> > \n> > The patch does the following code changes:\n> > \n> > a) moves  the  H_PAGE_F_SECOND and  H_PAGE_F_GIX to 4k PTE\n> > \theader   since it is no more needed b the 64k PTEs.\n> > b) abstracts  out __real_pte() and __rpte_to_hidx() so the\n> > \tcaller  need not know the bit location of the slot.\n> > c) moves the slot bits to the secondary pte.\n> > \n> > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>\n> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>\n> > ---\n> >  arch/powerpc/include/asm/book3s/64/hash-4k.h  |    3 ++\n> >  arch/powerpc/include/asm/book3s/64/hash-64k.h |   29 +++++++++++-------------\n> >  arch/powerpc/include/asm/book3s/64/hash.h     |    3 --\n> >  arch/powerpc/mm/hash64_64k.c                  |   23 ++++++++-----------\n> >  arch/powerpc/mm/hugetlbpage-hash64.c          |   18 ++++++---------\n> >  5 files changed, 33 insertions(+), 43 deletions(-)\n> > \n> > diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h\n> > index e66bfeb..dc153c6 100644\n> > --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h\n> > +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h\n> > @@ -16,6 +16,9 @@\n> >  #define H_PUD_TABLE_SIZE\t(sizeof(pud_t) << H_PUD_INDEX_SIZE)\n> >  #define H_PGD_TABLE_SIZE\t(sizeof(pgd_t) << H_PGD_INDEX_SIZE)\n> >  \n> > +#define H_PAGE_F_GIX_SHIFT\t56\n> > +#define H_PAGE_F_SECOND\t_RPAGE_RSV2\t/* HPTE is in 2ndary HPTEG */\n> > +#define H_PAGE_F_GIX\t(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)\n> >  #define H_PAGE_BUSY\t_RPAGE_RSV1     /* software: PTE & hash are busy */\n> >  \n> >  /* PTE flags to conserve for HPTE identification */\n> > diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h\n> > index e038f1c..89ef5a9 100644\n> > --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h\n> 
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h\n> > @@ -12,7 +12,7 @@\n> >   */\n> >  #define H_PAGE_COMBO\t_RPAGE_RPN0 /* this is a combo 4k page */\n> >  #define H_PAGE_4K_PFN\t_RPAGE_RPN1 /* PFN is for a single 4k page */\n> > -#define H_PAGE_BUSY\t_RPAGE_RPN42     /* software: PTE & hash are busy */\n> > +#define H_PAGE_BUSY\t_RPAGE_RPN44     /* software: PTE & hash are busy */\n> >  \n> >  /*\n> >   * We need to differentiate between explicit huge page and THP huge\n> > @@ -21,8 +21,7 @@\n> >  #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN\n> >  \n> >  /* PTE flags to conserve for HPTE identification */\n> > -#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_F_SECOND | \\\n> > -\t\t\t H_PAGE_F_GIX | H_PAGE_HASHPTE | H_PAGE_COMBO)\n> > +#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | H_PAGE_COMBO)\n> >  /*\n> >   * we support 16 fragments per PTE page of 64K size.\n> >   */\n> > @@ -50,24 +49,22 @@ static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)\n> >  \tunsigned long *hidxp;\n> >  \n> >  \trpte.pte = pte;\n> > -\trpte.hidx = 0;\n> > -\tif (pte_val(pte) & H_PAGE_COMBO) {\n> > -\t\t/*\n> > -\t\t * Make sure we order the hidx load against the H_PAGE_COMBO\n> > -\t\t * check. The store side ordering is done in __hash_page_4K\n> > -\t\t */\n> > -\t\tsmp_rmb();\n> > -\t\thidxp = (unsigned long *)(ptep + PTRS_PER_PTE);\n> > -\t\trpte.hidx = *hidxp;\n> > -\t}\n> > +\t/*\n> > +\t * Ensure that we do not read the hidx before we read\n> > +\t * the pte. 
Because the writer side is  expected\n> > +\t * to finish writing the hidx first followed by the pte,\n> > +\t * by using smp_wmb().\n> > +\t * pte_set_hash_slot() ensures that.\n> > +\t */\n> > +\tsmp_rmb();\n> > +\thidxp = (unsigned long *)(ptep + PTRS_PER_PTE);\n> > +\trpte.hidx = *hidxp;\n> >  \treturn rpte;\n> >  }\n> >  \n> >  static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)\n> >  {\n> > -\tif ((pte_val(rpte.pte) & H_PAGE_COMBO))\n> > -\t\treturn (rpte.hidx >> (index<<2)) & 0xf;\n> > -\treturn (pte_val(rpte.pte) >> H_PAGE_F_GIX_SHIFT) & 0xf;\n> > +\treturn ((rpte.hidx >> (index<<2)) & 0xfUL);\n> >  }\n> >  \n> >  /*\n> > diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h\n> > index 8ce4112..46f3a23 100644\n> > --- a/arch/powerpc/include/asm/book3s/64/hash.h\n> > +++ b/arch/powerpc/include/asm/book3s/64/hash.h\n> > @@ -8,9 +8,6 @@\n> >   *\n> >   */\n> >  #define H_PTE_NONE_MASK\t\t_PAGE_HPTEFLAGS\n> > -#define H_PAGE_F_GIX_SHIFT\t56\n> > -#define H_PAGE_F_SECOND\t\t_RPAGE_RSV2\t/* HPTE is in 2ndary HPTEG */\n> > -#define H_PAGE_F_GIX\t\t(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)\n> >  #define H_PAGE_HASHPTE\t\t_RPAGE_RPN43\t/* PTE has associated HPTE */\n> >  \n> >  #ifdef CONFIG_PPC_64K_PAGES\n> > diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c\n> > index c6c5559..9c63844 100644\n> > --- a/arch/powerpc/mm/hash64_64k.c\n> > +++ b/arch/powerpc/mm/hash64_64k.c\n> > @@ -103,8 +103,8 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,\n> >  \t\t * On hash insert failure we use old pte value and we don't\n> >  \t\t * want slot information there if we have a insert failure.\n> >  \t\t */\n> > -\t\told_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);\n> > -\t\tnew_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);\n> > +\t\told_pte &= ~H_PAGE_HASHPTE;\n> > +\t\tnew_pte &= ~H_PAGE_HASHPTE;\n> \n> Shouldn't we set 
old/new_pte.slot = invalid? via rpte.hidx\n\nby resetting the H_PAGE_HASHPTE flag, we are invalidating\nslot information.  Would that not be sufficient?\n\nRP\n\n> \n> >  \t\tgoto htab_insert_hpte;\n> >  \t}\n> >  \t/*\n> > @@ -227,6 +227,7 @@ int __hash_page_64K(unsigned long ea, unsigned long access,\n> >  \t\t    unsigned long vsid, pte_t *ptep, unsigned long trap,\n> >  \t\t    unsigned long flags, int ssize)\n> >  {\n> > +\treal_pte_t rpte;\n> >  \tunsigned long hpte_group;\n> >  \tunsigned long rflags, pa;\n> >  \tunsigned long old_pte, new_pte;\n> > @@ -263,6 +264,7 @@ int __hash_page_64K(unsigned long ea, unsigned long access,\n> >  \t} while (!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));\n> >  \n> >  \trflags = htab_convert_pte_flags(new_pte);\n> > +\trpte = __real_pte(__pte(old_pte), ptep);\n> >  \n> >  \tif (cpu_has_feature(CPU_FTR_NOEXECUTE) &&\n> >  \t    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))\n> > @@ -270,18 +272,13 @@ int __hash_page_64K(unsigned long ea, unsigned long access,\n> >  \n> >  \tvpn  = hpt_vpn(ea, vsid, ssize);\n> >  \tif (unlikely(old_pte & H_PAGE_HASHPTE)) {\n> > +\t\tunsigned long gslot;\n> >  \t\t/*\n> >  \t\t * There MIGHT be an HPTE for this pte\n> >  \t\t */\n> > -\t\thash = hpt_hash(vpn, shift, ssize);\n> > -\t\tif (old_pte & H_PAGE_F_SECOND)\n> > -\t\t\thash = ~hash;\n> > -\t\tslot = (hash & htab_hash_mask) * HPTES_PER_GROUP;\n> > -\t\tslot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;\n> > -\n> > -\t\tif (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_64K,\n> > -\t\t\t\t\t       MMU_PAGE_64K, ssize,\n> > -\t\t\t\t\t       flags) == -1)\n> > +\t\tgslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);\n> > +\t\tif (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, MMU_PAGE_64K,\n> > +\t\t\t\tMMU_PAGE_64K, ssize, flags) == -1)\n> >  \t\t\told_pte &= ~_PAGE_HPTEFLAGS;\n> >  \t}\n> >  \n> > @@ -328,9 +325,9 @@ int __hash_page_64K(unsigned long ea, unsigned long access,\n> >  \t\t\t\t\t   MMU_PAGE_64K, 
MMU_PAGE_64K, old_pte);\n> >  \t\t\treturn -1;\n> >  \t\t}\n> > +\n> >  \t\tnew_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;\n> > -\t\tnew_pte |= (slot << H_PAGE_F_GIX_SHIFT) &\n> > -\t\t\t(H_PAGE_F_SECOND | H_PAGE_F_GIX);\n> > +\t\tnew_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);\n> >  \t}\n> >  \t*ptep = __pte(new_pte & ~H_PAGE_BUSY);\n> >  \treturn 0;\n> > diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c\n> > index a84bb44..d52d667 100644\n> > --- a/arch/powerpc/mm/hugetlbpage-hash64.c\n> > +++ b/arch/powerpc/mm/hugetlbpage-hash64.c\n> > @@ -22,6 +22,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,\n> >  \t\t     pte_t *ptep, unsigned long trap, unsigned long flags,\n> >  \t\t     int ssize, unsigned int shift, unsigned int mmu_psize)\n> >  {\n> > +\treal_pte_t rpte;\n> >  \tunsigned long vpn;\n> >  \tunsigned long old_pte, new_pte;\n> >  \tunsigned long rflags, pa, sz;\n> > @@ -61,6 +62,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,\n> >  \t} while(!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));\n> >  \n> >  \trflags = htab_convert_pte_flags(new_pte);\n> > +\trpte = __real_pte(__pte(old_pte), ptep);\n> >  \n> >  \tsz = ((1UL) << shift);\n> >  \tif (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))\n> > @@ -71,16 +73,11 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,\n> >  \t/* Check if pte already has an hpte (case 2) */\n> >  \tif (unlikely(old_pte & H_PAGE_HASHPTE)) {\n> >  \t\t/* There MIGHT be an HPTE for this pte */\n> > -\t\tunsigned long hash, slot;\n> > +\t\tunsigned long gslot;\n> >  \n> > -\t\thash = hpt_hash(vpn, shift, ssize);\n> > -\t\tif (old_pte & H_PAGE_F_SECOND)\n> > -\t\t\thash = ~hash;\n> > -\t\tslot = (hash & htab_hash_mask) * HPTES_PER_GROUP;\n> > -\t\tslot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;\n> > -\n> > -\t\tif (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, mmu_psize,\n> 
> -\t\t\t\t\t       mmu_psize, ssize, flags) == -1)\n> > +\t\tgslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);\n> > +\t\tif (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, mmu_psize,\n> > +\t\t\t\tmmu_psize, ssize, flags) == -1)\n> >  \t\t\told_pte &= ~_PAGE_HPTEFLAGS;\n> >  \t}\n> >  \n> > @@ -106,8 +103,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,\n> >  \t\t\treturn -1;\n> >  \t\t}\n> >  \n> > -\t\tnew_pte |= (slot << H_PAGE_F_GIX_SHIFT) &\n> > -\t\t\t(H_PAGE_F_SECOND | H_PAGE_F_GIX);\n> > +\t\tnew_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);\n> >  \t}\n> >  \n> >  \t/*\n> \n> Balbir","headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xtR683Shyz9ryv\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 15 Sep 2017 03:56:00 +1000 (AEST)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3xtR682gjwzDrYD\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 15 Sep 2017 03:56:00 +1000 (AEST)","from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com\n\t[148.163.156.1])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3xtR4J4dsHzDqZ7\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tFri, 15 Sep 2017 03:54:24 +1000 (AEST)","from pps.filterd (m0098410.ppops.net [127.0.0.1])\n\tby mx0a-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id\n\tv8EHsJ2e124883\n\tfor <linuxppc-dev@lists.ozlabs.org>; Thu, 14 Sep 2017 13:54:21 -0400","from 
e38.co.us.ibm.com (e38.co.us.ibm.com [32.97.110.159])\n\tby mx0a-001b2d01.pphosted.com with ESMTP id 2cyu30cwyr-1\n\t(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)\n\tfor <linuxppc-dev@lists.ozlabs.org>; Thu, 14 Sep 2017 13:54:20 -0400","from localhost\n\tby e38.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use\n\tOnly! Violators will be prosecuted\n\tfor <linuxppc-dev@lists.ozlabs.org> from <linuxram@us.ibm.com>;\n\tThu, 14 Sep 2017 11:54:17 -0600","from b03cxnp07028.gho.boulder.ibm.com (9.17.130.15)\n\tby e38.co.us.ibm.com (192.168.1.138) with IBM ESMTP SMTP Gateway:\n\tAuthorized Use Only! Violators will be prosecuted; \n\tThu, 14 Sep 2017 11:54:14 -0600","from b03ledav004.gho.boulder.ibm.com\n\t(b03ledav004.gho.boulder.ibm.com [9.17.130.235])\n\tby b03cxnp07028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with\n\tESMTP id v8EHsEWo1114596; Thu, 14 Sep 2017 10:54:14 -0700","from b03ledav004.gho.boulder.ibm.com (unknown [127.0.0.1])\n\tby IMSVA (Postfix) with ESMTP id 001857803F;\n\tThu, 14 Sep 2017 11:54:14 -0600 (MDT)","from ram.oc3035372033.ibm.com (unknown [9.85.179.168])\n\tby b03ledav004.gho.boulder.ibm.com (Postfix) with ESMTPS id\n\tB160778037; Thu, 14 Sep 2017 11:54:11 -0600 (MDT)"],"Authentication-Results":"ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=us.ibm.com\n\t(client-ip=148.163.156.1; helo=mx0a-001b2d01.pphosted.com;\n\tenvelope-from=linuxram@us.ibm.com; receiver=<UNKNOWN>)","Date":"Thu, 14 Sep 2017 10:54:09 -0700","From":"Ram Pai <linuxram@us.ibm.com>","To":"Balbir Singh <bsingharora@gmail.com>","Subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","References":"<1504910713-7094-1-git-send-email-linuxram@us.ibm.com>\n\t<1504910713-7094-5-git-send-email-linuxram@us.ibm.com>\n\t<20170914114449.40446d96@firefly.ozlabs.ibm.com>","MIME-Version":"1.0","Content-Type":"text/plain; 
charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20170914114449.40446d96@firefly.ozlabs.ibm.com>","User-Agent":"Mutt/1.5.20 (2009-12-10)","X-TM-AS-GCONF":"00","x-cbid":"17091417-0028-0000-0000-0000085BDDC2","X-IBM-SpamModules-Scores":"","X-IBM-SpamModules-Versions":"BY=3.00007733; HX=3.00000241; KW=3.00000007;\n\tPH=3.00000004; SC=3.00000227; SDB=6.00917001; UDB=6.00460538;\n\tIPR=6.00697228; \n\tBA=6.00005589; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009;\n\tZB=6.00000000; \n\tZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017154;\n\tXFM=3.00000015; UTC=2017-09-14 17:54:16","X-IBM-AV-DETECTION":"SAVI=unused REMOTE=unused XFE=unused","x-cbparentid":"17091417-0029-0000-0000-0000378D789D","Message-Id":"<20170914175408.GF5698@ram.oc3035372033.ibm.com>","X-Proofpoint-Virus-Version":"vendor=fsecure engine=2.50.10432:, ,\n\tdefinitions=2017-09-14_05:, , signatures=0","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound score=0\n\tspamscore=0 suspectscore=2\n\tmalwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam\n\tadjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000\n\tdefinitions=main-1709140271","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.24","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Reply-To":"Ram Pai <linuxram@us.ibm.com>","Cc":"ebiederm@xmission.com, mhocko@kernel.org, paulus@samba.org,\n\taneesh.kumar@linux.vnet.ibm.com, 
bauerman@linux.vnet.ibm.com,\n\tlinuxppc-dev@lists.ozlabs.org, khandual@linux.vnet.ibm.com","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"}},{"id":1768769,"web_url":"http://patchwork.ozlabs.org/comment/1768769/","msgid":"<20170914182530.GA5721@ram.oc3035372033.ibm.com>","date":"2017-09-14T18:25:30","subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","submitter":{"id":2667,"url":"http://patchwork.ozlabs.org/api/people/2667/","name":"Ram Pai","email":"linuxram@us.ibm.com"},"content":"On Thu, Sep 14, 2017 at 10:54:08AM -0700, Ram Pai wrote:\n> On Thu, Sep 14, 2017 at 11:44:49AM +1000, Balbir Singh wrote:\n> > On Fri,  8 Sep 2017 15:44:44 -0700\n> > Ram Pai <linuxram@us.ibm.com> wrote:\n> > \n> > > Rearrange 64K PTE bits to  free  up  bits 3, 4, 5  and  6\n> > > in the 64K backed HPTE pages. This along with the earlier\n> > > patch will  entirely free  up the four bits from 64K PTE.\n> > > The bit numbers are  big-endian as defined in the  ISA3.0\n> > > \n> > > This patch  does  the  following change to 64K PTE backed\n> > > by 64K HPTE.\n> > > \n> > > H_PAGE_F_SECOND (S) which  occupied  bit  4  moves to the\n> > > \tsecond part of the pte to bit 60.\n> > > H_PAGE_F_GIX (G,I,X) which  occupied  bit 5, 6 and 7 also\n> > > \tmoves  to  the   second part of the pte to bit 61,\n> > >        \t62, 63, 64 respectively\n> > > \n> > > since bit 7 is now freed up, we move H_PAGE_BUSY (B) from\n> > > bit  9  to  bit  7.\n> > > \n> > > The second part of the PTE will hold\n> > > (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.\n> > > NOTE: None of the bits in the secondary PTE were not used\n> > > by 64k-HPTE backed PTE.\n> > > \n> > > Before the patch, the 64K HPTE backed 64k PTE format was\n> > > as follows\n> > > \n> > >  0 1 2 3 4  5  6  7  8 9 10...........................63\n> > >  : : : : :  :  :  
:  : : :                            :\n> > >  v v v v v  v  v  v  v v v                            v\n> > > \n> > > ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,\n> > > |x|x|x| |S |G |I |X |x|B| |x|x|................|x|x|x|x| <- primary pte\n> > > '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'\n> > > | | | | |  |  |  |  | | | | |..................| | | | | <- secondary pte\n> > > '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'\n> > > \n> > > After the patch, the 64k HPTE backed 64k PTE format is\n> > > as follows\n> > > \n> > >  0 1 2 3 4  5  6  7  8 9 10...........................63\n> > >  : : : : :  :  :  :  : : :                            :\n> > >  v v v v v  v  v  v  v v v                            v\n> > > \n> > > ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,\n> > > |x|x|x| |  |  |  |B |x| | |x|x|................|.|.|.|.| <- primary pte\n> > > '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'\n> > > | | | | |  |  |  |  | | | | |..................|S|G|I|X| <- secondary pte\n> > > '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'\n> > > \n> > > The above PTE changes is applicable to hugetlbpages aswell.\n> > > \n> > > The patch does the following code changes:\n> > > \n> > > a) moves  the  H_PAGE_F_SECOND and  H_PAGE_F_GIX to 4k PTE\n> > > \theader   since it is no more needed b the 64k PTEs.\n> > > b) abstracts  out __real_pte() and __rpte_to_hidx() so the\n> > > \tcaller  need not know the bit location of the slot.\n> > > c) moves the slot bits to the secondary pte.\n> > > \n> > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>\n> > > Signed-off-by: Ram Pai <linuxram@us.ibm.com>\n> > > ---\n> > >  arch/powerpc/include/asm/book3s/64/hash-4k.h  |    3 ++\n> > >  arch/powerpc/include/asm/book3s/64/hash-64k.h |   29 +++++++++++-------------\n> > >  arch/powerpc/include/asm/book3s/64/hash.h     |    3 --\n> > >  arch/powerpc/mm/hash64_64k.c                  |   23 ++++++++-----------\n> > >  
arch/powerpc/mm/hugetlbpage-hash64.c          |   18 ++++++---------\n> > >  5 files changed, 33 insertions(+), 43 deletions(-)\n> > > \n> > > diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h\n> > > index e66bfeb..dc153c6 100644\n> > > --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h\n> > > +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h\n> > > @@ -16,6 +16,9 @@\n> > >  #define H_PUD_TABLE_SIZE\t(sizeof(pud_t) << H_PUD_INDEX_SIZE)\n> > >  #define H_PGD_TABLE_SIZE\t(sizeof(pgd_t) << H_PGD_INDEX_SIZE)\n> > >  \n> > > +#define H_PAGE_F_GIX_SHIFT\t56\n> > > +#define H_PAGE_F_SECOND\t_RPAGE_RSV2\t/* HPTE is in 2ndary HPTEG */\n> > > +#define H_PAGE_F_GIX\t(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)\n> > >  #define H_PAGE_BUSY\t_RPAGE_RSV1     /* software: PTE & hash are busy */\n> > >  \n> > >  /* PTE flags to conserve for HPTE identification */\n> > > diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h\n> > > index e038f1c..89ef5a9 100644\n> > > --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h\n> > > +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h\n> > > @@ -12,7 +12,7 @@\n> > >   */\n> > >  #define H_PAGE_COMBO\t_RPAGE_RPN0 /* this is a combo 4k page */\n> > >  #define H_PAGE_4K_PFN\t_RPAGE_RPN1 /* PFN is for a single 4k page */\n> > > -#define H_PAGE_BUSY\t_RPAGE_RPN42     /* software: PTE & hash are busy */\n> > > +#define H_PAGE_BUSY\t_RPAGE_RPN44     /* software: PTE & hash are busy */\n> > >  \n> > >  /*\n> > >   * We need to differentiate between explicit huge page and THP huge\n> > > @@ -21,8 +21,7 @@\n> > >  #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN\n> > >  \n> > >  /* PTE flags to conserve for HPTE identification */\n> > > -#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_F_SECOND | \\\n> > > -\t\t\t H_PAGE_F_GIX | H_PAGE_HASHPTE | H_PAGE_COMBO)\n> > > +#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | H_PAGE_COMBO)\n> > >  /*\n> > >   * 
we support 16 fragments per PTE page of 64K size.\n> > >   */\n> > > @@ -50,24 +49,22 @@ static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)\n> > >  \tunsigned long *hidxp;\n> > >  \n> > >  \trpte.pte = pte;\n> > > -\trpte.hidx = 0;\n> > > -\tif (pte_val(pte) & H_PAGE_COMBO) {\n> > > -\t\t/*\n> > > -\t\t * Make sure we order the hidx load against the H_PAGE_COMBO\n> > > -\t\t * check. The store side ordering is done in __hash_page_4K\n> > > -\t\t */\n> > > -\t\tsmp_rmb();\n> > > -\t\thidxp = (unsigned long *)(ptep + PTRS_PER_PTE);\n> > > -\t\trpte.hidx = *hidxp;\n> > > -\t}\n> > > +\t/*\n> > > +\t * Ensure that we do not read the hidx before we read\n> > > +\t * the pte. Because the writer side is  expected\n> > > +\t * to finish writing the hidx first followed by the pte,\n> > > +\t * by using smp_wmb().\n> > > +\t * pte_set_hash_slot() ensures that.\n> > > +\t */\n> > > +\tsmp_rmb();\n> > > +\thidxp = (unsigned long *)(ptep + PTRS_PER_PTE);\n> > > +\trpte.hidx = *hidxp;\n> > >  \treturn rpte;\n> > >  }\n> > >  \n> > >  static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)\n> > >  {\n> > > -\tif ((pte_val(rpte.pte) & H_PAGE_COMBO))\n> > > -\t\treturn (rpte.hidx >> (index<<2)) & 0xf;\n> > > -\treturn (pte_val(rpte.pte) >> H_PAGE_F_GIX_SHIFT) & 0xf;\n> > > +\treturn ((rpte.hidx >> (index<<2)) & 0xfUL);\n> > >  }\n> > >  \n> > >  /*\n> > > diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h\n> > > index 8ce4112..46f3a23 100644\n> > > --- a/arch/powerpc/include/asm/book3s/64/hash.h\n> > > +++ b/arch/powerpc/include/asm/book3s/64/hash.h\n> > > @@ -8,9 +8,6 @@\n> > >   *\n> > >   */\n> > >  #define H_PTE_NONE_MASK\t\t_PAGE_HPTEFLAGS\n> > > -#define H_PAGE_F_GIX_SHIFT\t56\n> > > -#define H_PAGE_F_SECOND\t\t_RPAGE_RSV2\t/* HPTE is in 2ndary HPTEG */\n> > > -#define H_PAGE_F_GIX\t\t(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)\n> > >  #define H_PAGE_HASHPTE\t\t_RPAGE_RPN43\t/* PTE has associated 
HPTE */\n> > >  \n> > >  #ifdef CONFIG_PPC_64K_PAGES\n> > > diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c\n> > > index c6c5559..9c63844 100644\n> > > --- a/arch/powerpc/mm/hash64_64k.c\n> > > +++ b/arch/powerpc/mm/hash64_64k.c\n> > > @@ -103,8 +103,8 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,\n> > >  \t\t * On hash insert failure we use old pte value and we don't\n> > >  \t\t * want slot information there if we have a insert failure.\n> > >  \t\t */\n> > > -\t\told_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);\n> > > -\t\tnew_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);\n> > > +\t\told_pte &= ~H_PAGE_HASHPTE;\n> > > +\t\tnew_pte &= ~H_PAGE_HASHPTE;\n> > \n> > Shouldn't we set old/new_pte.slot = invalid? via rpte.hidx\n> \n> by resetting the H_PAGE_HASHPTE flag, we are invalidating\n> slot information.  Would that not be sufficient?\n\nI think I misunderstood your question. Yes, rpte.hidx will have\nto be reset to invalid. 
The code does that further down in that\nfunction.\n\n\tif (!(old_pte & H_PAGE_COMBO))\n\t\trpte.hidx = ~0x0UL;\n\n\nRP","headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xtRqY0Kv5z9s82\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 15 Sep 2017 04:28:25 +1000 (AEST)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3xtRqX68fMzDrCv\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 15 Sep 2017 04:28:24 +1000 (AEST)","from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com\n\t[148.163.158.5])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3xtRmS1nQ8zDqr5\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tFri, 15 Sep 2017 04:25:44 +1000 (AEST)","from pps.filterd (m0098416.ppops.net [127.0.0.1])\n\tby mx0b-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id\n\tv8EIL8OV024536\n\tfor <linuxppc-dev@lists.ozlabs.org>; Thu, 14 Sep 2017 14:25:38 -0400","from e17.ny.us.ibm.com (e17.ny.us.ibm.com [129.33.205.207])\n\tby mx0b-001b2d01.pphosted.com with ESMTP id 2cyur532x7-1\n\t(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)\n\tfor <linuxppc-dev@lists.ozlabs.org>; Thu, 14 Sep 2017 14:25:37 -0400","from localhost\n\tby e17.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use\n\tOnly! 
Violators will be prosecuted\n\tfor <linuxppc-dev@lists.ozlabs.org> from <linuxram@us.ibm.com>;\n\tThu, 14 Sep 2017 14:25:37 -0400","from b01cxnp22035.gho.pok.ibm.com (9.57.198.25)\n\tby e17.ny.us.ibm.com (146.89.104.204) with IBM ESMTP SMTP Gateway:\n\tAuthorized Use Only! Violators will be prosecuted; \n\tThu, 14 Sep 2017 14:25:34 -0400","from b01ledav006.gho.pok.ibm.com (b01ledav006.gho.pok.ibm.com\n\t[9.57.199.111])\n\tby b01cxnp22035.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP\n\tid v8EIPXT550462752; Thu, 14 Sep 2017 18:25:33 GMT","from b01ledav006.gho.pok.ibm.com (unknown [127.0.0.1])\n\tby IMSVA (Postfix) with ESMTP id B6F9EAC03F;\n\tThu, 14 Sep 2017 14:26:03 -0400 (EDT)","from ram.oc3035372033.ibm.com (unknown [9.85.179.168])\n\tby b01ledav006.gho.pok.ibm.com (Postfix) with ESMTPS id 85CCCAC03A;\n\tThu, 14 Sep 2017 14:26:02 -0400 (EDT)"],"Authentication-Results":"ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=us.ibm.com\n\t(client-ip=148.163.158.5; helo=mx0a-001b2d01.pphosted.com;\n\tenvelope-from=linuxram@us.ibm.com; receiver=<UNKNOWN>)","Date":"Thu, 14 Sep 2017 11:25:30 -0700","From":"Ram Pai <linuxram@us.ibm.com>","To":"Balbir Singh <bsingharora@gmail.com>","Subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","References":"<1504910713-7094-1-git-send-email-linuxram@us.ibm.com>\n\t<1504910713-7094-5-git-send-email-linuxram@us.ibm.com>\n\t<20170914114449.40446d96@firefly.ozlabs.ibm.com>\n\t<20170914175408.GF5698@ram.oc3035372033.ibm.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20170914175408.GF5698@ram.oc3035372033.ibm.com>","User-Agent":"Mutt/1.5.20 (2009-12-10)","X-TM-AS-GCONF":"00","x-cbid":"17091418-0040-0000-0000-000003A1F71C","X-IBM-SpamModules-Scores":"","X-IBM-SpamModules-Versions":"BY=3.00007734; HX=3.00000241; KW=3.00000007;\n\tPH=3.00000004; SC=3.00000227; SDB=6.00917011; UDB=6.00460544;\n\tIPR=6.00697239; 
\n\tBA=6.00005589; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009;\n\tZB=6.00000000; \n\tZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017154;\n\tXFM=3.00000015; UTC=2017-09-14 18:25:36","X-IBM-AV-DETECTION":"SAVI=unused REMOTE=unused XFE=unused","x-cbparentid":"17091418-0041-0000-0000-00000796F924","Message-Id":"<20170914182530.GA5721@ram.oc3035372033.ibm.com>","X-Proofpoint-Virus-Version":"vendor=fsecure engine=2.50.10432:, ,\n\tdefinitions=2017-09-14_05:, , signatures=0","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound score=0\n\tspamscore=0 suspectscore=2\n\tmalwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam\n\tadjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000\n\tdefinitions=main-1709140277","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.24","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Reply-To":"Ram Pai <linuxram@us.ibm.com>","Cc":"ebiederm@xmission.com, mhocko@kernel.org, paulus@samba.org,\n\taneesh.kumar@linux.vnet.ibm.com, bauerman@linux.vnet.ibm.com,\n\tlinuxppc-dev@lists.ozlabs.org, khandual@linux.vnet.ibm.com","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"}},{"id":1792483,"web_url":"http://patchwork.ozlabs.org/comment/1792483/","msgid":"<87y3o28cv7.fsf@linux.vnet.ibm.com>","date":"2017-10-23T08:52:44","subject":"Re: [PATCH 4/7] 
powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","submitter":{"id":664,"url":"http://patchwork.ozlabs.org/api/people/664/","name":"Aneesh Kumar K.V","email":"aneesh.kumar@linux.vnet.ibm.com"},"content":"Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:\n\n> On Fri, 2017-09-08 at 15:44 -0700, Ram Pai wrote:\n>> The second part of the PTE will hold\n>> (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.\n>> NOTE: None of the bits in the secondary PTE were not used\n>> by 64k-HPTE backed PTE.\n>\n> Have you measured the performance impact of this ? The second part of\n> the PTE being in a different cache line there could be one...\n>\n\nI am also looking at a patch series removing the slot tracking\ncompletely. With address randomization turned off and no swap in guest/host\nand making sure we touched most of guest RAM, I don't find much impact\nin performance when we don't track the slot at all. I will post the\npatch series with numbers in a day or two. But my test was\n\nwhile (5000) {\n      mmap(128M)\n      touch every page of 2048 pages\n      munmap()\n}\n\nIt could also be the best case in my run because I might have always\nfound the hash pte slot in the primary. In one measurement with swap on\nand address randomization enabled, I did find a 50% impact. But then I\nwas not able to recreate that again. 
So could be something i did wrong\nin the test setup.\n\nRam,\n\nWill you be able to get a test run with the above loop?\n\n-aneesh","headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3yLCmm5WB3z9t5C\n\tfor <patchwork-incoming@ozlabs.org>;\n\tMon, 23 Oct 2017 21:48:24 +1100 (AEDT)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3yLCmm4YBwzDqyC\n\tfor <patchwork-incoming@ozlabs.org>;\n\tMon, 23 Oct 2017 21:48:24 +1100 (AEDT)","from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com\n\t[148.163.158.5])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3yLClN5jd8zDqhf\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tMon, 23 Oct 2017 21:47:12 +1100 (AEDT)","from pps.filterd (m0098416.ppops.net [127.0.0.1])\n\tby mx0b-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id\n\tv9NAiQTA137843\n\tfor <linuxppc-dev@lists.ozlabs.org>; Mon, 23 Oct 2017 06:47:10 -0400","from e06smtp13.uk.ibm.com (e06smtp13.uk.ibm.com [195.75.94.109])\n\tby mx0b-001b2d01.pphosted.com with ESMTP id 2dsa7nkun6-1\n\t(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)\n\tfor <linuxppc-dev@lists.ozlabs.org>; Mon, 23 Oct 2017 06:47:09 -0400","from localhost\n\tby e06smtp13.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use\n\tOnly! 
Violators will be prosecuted\n\tfor <linuxppc-dev@lists.ozlabs.org> from\n\t<aneesh.kumar@linux.vnet.ibm.com>; Mon, 23 Oct 2017 11:47:08 +0100","from b06cxnps3075.portsmouth.uk.ibm.com (9.149.109.195)\n\tby e06smtp13.uk.ibm.com (192.168.101.143) with IBM ESMTP SMTP\n\tGateway: Authorized Use Only! Violators will be prosecuted; \n\tMon, 23 Oct 2017 11:47:06 +0100","from d23av01.au.ibm.com (d23av01.au.ibm.com [9.190.234.96])\n\tby b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with\n\tESMTP id v9NAl4QP31326390\n\tfor <linuxppc-dev@lists.ozlabs.org>; Mon, 23 Oct 2017 10:47:05 GMT","from d23av01.au.ibm.com (localhost [127.0.0.1])\n\tby d23av01.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id\n\tv9NAl42m009057\n\tfor <linuxppc-dev@lists.ozlabs.org>; Mon, 23 Oct 2017 21:47:05 +1100","from skywalker ([9.85.198.75])\n\tby d23av01.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with SMTP id\n\tv9NAktGv008888; Mon, 23 Oct 2017 21:46:58 +1100","(nullmailer pid 7017 invoked by uid 1000);\n\tMon, 23 Oct 2017 08:52:45 -0000"],"Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com\n\t(client-ip=148.163.158.5; helo=mx0a-001b2d01.pphosted.com;\n\tenvelope-from=aneesh.kumar@linux.vnet.ibm.com; receiver=<UNKNOWN>)","From":"\"Aneesh Kumar K.V\" <aneesh.kumar@linux.vnet.ibm.com>","To":"Benjamin Herrenschmidt <benh@kernel.crashing.org>,\n\tRam Pai <linuxram@us.ibm.com>, mpe@ellerman.id.au,\n\tlinuxppc-dev@lists.ozlabs.org","Subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","In-Reply-To":"<1505376837.12628.192.camel@kernel.crashing.org>","References":"<1504910713-7094-1-git-send-email-linuxram@us.ibm.com>\n\t<1504910713-7094-5-git-send-email-linuxram@us.ibm.com>\n\t<1505376837.12628.192.camel@kernel.crashing.org>","Date":"Mon, 23 Oct 2017 14:22:44 
+0530","MIME-Version":"1.0","Content-Type":"text/plain","X-TM-AS-MML":"disable","x-cbid":"17102310-0012-0000-0000-00000584DFBC","X-IBM-AV-DETECTION":"SAVI=unused REMOTE=unused XFE=unused","x-cbparentid":"17102310-0013-0000-0000-000018FF4F08","Message-Id":"<87y3o28cv7.fsf@linux.vnet.ibm.com>","X-Proofpoint-Virus-Version":"vendor=fsecure engine=2.50.10432:, ,\n\tdefinitions=2017-10-23_03:, , signatures=0","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound score=0\n\tspamscore=0 suspectscore=1\n\tmalwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam\n\tadjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000\n\tdefinitions=main-1710230155","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.24","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Cc":"mhocko@kernel.org, paulus@samba.org, ebiederm@xmission.com,\n\tbauerman@linux.vnet.ibm.com, khandual@linux.vnet.ibm.com","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"}},{"id":1792755,"web_url":"http://patchwork.ozlabs.org/comment/1792755/","msgid":"<20171023192236.GA5488@ram.oc3035372033.ibm.com>","date":"2017-10-23T19:22:36","subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","submitter":{"id":2667,"url":"http://patchwork.ozlabs.org/api/people/2667/","name":"Ram 
Pai","email":"linuxram@us.ibm.com"},"content":"On Thu, Sep 14, 2017 at 06:13:57PM +1000, Benjamin Herrenschmidt wrote:\n> On Fri, 2017-09-08 at 15:44 -0700, Ram Pai wrote:\n> > The second part of the PTE will hold\n> > (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.\n> > NOTE: None of the bits in the secondary PTE were not used\n> > by 64k-HPTE backed PTE.\n> \n> Have you measured the performance impact of this ? The second part of\n> the PTE being in a different cache line there could be one...\n\nhmm..missed responding to this comment.\n\nI did a preliminary measurement running mmap_bench from the selftests.\nRan it multiple times. Almost always the numbers were either equal to\nor better than those without the patch series.\n\nRP","headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3yLRCd3pNkz9s81\n\tfor <patchwork-incoming@ozlabs.org>;\n\tTue, 24 Oct 2017 06:23:57 +1100 (AEDT)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3yLRCd2nZkzDql4\n\tfor <patchwork-incoming@ozlabs.org>;\n\tTue, 24 Oct 2017 06:23:57 +1100 (AEDT)","from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com\n\t[148.163.156.1])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3yLRBH6K6MzDqfh\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tTue, 24 Oct 2017 06:22:47 +1100 (AEDT)","from pps.filterd (m0098393.ppops.net [127.0.0.1])\n\tby mx0a-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP 
id\n\tv9NJL1bG064737\n\tfor <linuxppc-dev@lists.ozlabs.org>; Mon, 23 Oct 2017 15:22:45 -0400","from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.152])\n\tby mx0a-001b2d01.pphosted.com with ESMTP id 2dsj2tmf90-1\n\t(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)\n\tfor <linuxppc-dev@lists.ozlabs.org>; Mon, 23 Oct 2017 15:22:45 -0400","from localhost\n\tby e34.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use\n\tOnly! Violators will be prosecuted\n\tfor <linuxppc-dev@lists.ozlabs.org> from <linuxram@us.ibm.com>;\n\tMon, 23 Oct 2017 13:22:44 -0600","from b03cxnp08026.gho.boulder.ibm.com (9.17.130.18)\n\tby e34.co.us.ibm.com (192.168.1.134) with IBM ESMTP SMTP Gateway:\n\tAuthorized Use Only! Violators will be prosecuted; \n\tMon, 23 Oct 2017 13:22:41 -0600","from b03ledav004.gho.boulder.ibm.com\n\t(b03ledav004.gho.boulder.ibm.com [9.17.130.235])\n\tby b03cxnp08026.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with\n\tESMTP id v9NJMfxM63176946; Mon, 23 Oct 2017 12:22:41 -0700","from b03ledav004.gho.boulder.ibm.com (unknown [127.0.0.1])\n\tby IMSVA (Postfix) with ESMTP id 5C32E78038;\n\tMon, 23 Oct 2017 13:22:41 -0600 (MDT)","from ram.oc3035372033.ibm.com (unknown [9.85.182.80])\n\tby b03ledav004.gho.boulder.ibm.com (Postfix) with ESMTPS id\n\t7643F78037; Mon, 23 Oct 2017 13:22:39 -0600 (MDT)"],"Authentication-Results":"ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=us.ibm.com\n\t(client-ip=148.163.156.1; helo=mx0a-001b2d01.pphosted.com;\n\tenvelope-from=linuxram@us.ibm.com; receiver=<UNKNOWN>)","Date":"Mon, 23 Oct 2017 12:22:36 -0700","From":"Ram Pai <linuxram@us.ibm.com>","To":"Benjamin Herrenschmidt <benh@kernel.crashing.org>","Subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","References":"<1504910713-7094-1-git-send-email-linuxram@us.ibm.com>\n\t<1504910713-7094-5-git-send-email-linuxram@us.ibm.com>\n\t<1505376837.12628.192.camel@kernel.crashing.org>","MIME-Version":"1.0","Content-Type":"text/plain; 
charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<1505376837.12628.192.camel@kernel.crashing.org>","User-Agent":"Mutt/1.5.20 (2009-12-10)","X-TM-AS-GCONF":"00","x-cbid":"17102319-0016-0000-0000-000007B2972D","X-IBM-SpamModules-Scores":"","X-IBM-SpamModules-Versions":"BY=3.00007940; HX=3.00000241; KW=3.00000007;\n\tPH=3.00000004; SC=3.00000239; SDB=6.00935419; UDB=6.00471278;\n\tIPR=6.00715685; \n\tBA=6.00005656; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009;\n\tZB=6.00000000; \n\tZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017674;\n\tXFM=3.00000015; UTC=2017-10-23 19:22:43","X-IBM-AV-DETECTION":"SAVI=unused REMOTE=unused XFE=unused","x-cbparentid":"17102319-0017-0000-0000-00003BF904DF","Message-Id":"<20171023192236.GA5488@ram.oc3035372033.ibm.com>","X-Proofpoint-Virus-Version":"vendor=fsecure engine=2.50.10432:, ,\n\tdefinitions=2017-10-23_09:, , signatures=0","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound score=0\n\tspamscore=0 suspectscore=0\n\tmalwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam\n\tadjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000\n\tdefinitions=main-1710230272","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.24","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Reply-To":"Ram Pai <linuxram@us.ibm.com>","Cc":"mhocko@kernel.org, paulus@samba.org, aneesh.kumar@linux.vnet.ibm.com,\n\tbauerman@linux.vnet.ibm.com, 
khandual@linux.vnet.ibm.com,\n\tlinuxppc-dev@lists.ozlabs.org, ebiederm@xmission.com","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"}},{"id":1792894,"web_url":"http://patchwork.ozlabs.org/comment/1792894/","msgid":"<20171023234246.GA5485@ram.oc3035372033.ibm.com>","date":"2017-10-23T23:42:46","subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","submitter":{"id":2667,"url":"http://patchwork.ozlabs.org/api/people/2667/","name":"Ram Pai","email":"linuxram@us.ibm.com"},"content":"On Mon, Oct 23, 2017 at 02:22:44PM +0530, Aneesh Kumar K.V wrote:\n> Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:\n> \n> > On Fri, 2017-09-08 at 15:44 -0700, Ram Pai wrote:\n> >> The second part of the PTE will hold\n> >> (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.\n> >> NOTE: None of the bits in the secondary PTE were not used\n> >> by 64k-HPTE backed PTE.\n> >\n> > Have you measured the performance impact of this ? The second part of\n> > the PTE being in a different cache line there could be one...\n> >\n> \n> I am also looking at a patch series removing the slot tracking\n> completely. With randomize address turned off and no swap in guest/host\n> and making sure we touched most of guest ram, I don't find much impact\n> in performance when we don't track the slot at all. I will post the\n> patch series with numbers in a day or two. But my test was\n> \n> while (5000) {\n>       mmap(128M)\n>       touch every page of 2048 pages\n>       munmap()\n> }\n> \n> I could also be the best case in my run because i might have always\n> found the hash pte slot in the primary. In one measurement with swap on\n> and address randmization enabled, i did find a 50% impact. But then i\n> was not able to recreate that again. 
So could be something i did wrong\n> in the test setup.\n> \n> Ram,\n> \n> Will you be able to get a test run with the above loop?\n\nYes. results with patch look good; better than w/o patch.\n\n\n/-----------------------------------------------\\\n|Itteratn| secs w/ patch\t|secs w/o patch |\n-------------------------------------------------\n|1\t | 45.572621     \t| 49.046994\t|\n|2\t | 46.049545     \t| 49.378756\t|\n|3\t | 46.103657     \t| 49.223591\t|\n|4\t | 46.298903     \t| 48.991245\t|\n|5\t | 46.353202     \t| 48.988033\t|\n|6\t | 45.440878     \t| 49.175846\t|\n|7\t | 46.860373     \t| 49.008395\t|\n|8\t | 46.221390     \t| 49.236964\t|\n|9\t | 45.794993     \t| 49.171927\t|\n|10\t | 46.569491     \t| 48.995628\t|\n|-----------------------------------------------|\n|average  | 46.1265053\t\t| 49.1217379    |\n\\-----------------------------------------------/\n\n\nThe code is as follows:\n\n\ndiff --git a/tools/testing/selftests/powerpc/benchmarks/mmap_bench.c b/tools/testing/selftests/powerpc/benchmarks/mmap_bench.c\nindex 8d084a2..ef2ad87 100644\n--- a/tools/testing/selftests/powerpc/benchmarks/mmap_bench.c\n+++ b/tools/testing/selftests/powerpc/benchmarks/mmap_bench.c\n@@ -10,14 +10,14 @@\n \n #include \"utils.h\"\n \n-#define ITERATIONS 5000000\n+#define ITERATIONS 5000\n \n #define MEMSIZE (128 * 1024 * 1024)\n \n int test_mmap(void)\n {\n \tstruct timespec ts_start, ts_end;\n-\tunsigned long i = ITERATIONS;\n+\tunsigned long i = ITERATIONS, j;\n \n \tclock_gettime(CLOCK_MONOTONIC, &ts_start);\n \n@@ -25,6 +25,10 @@ int test_mmap(void)\n \t\tchar *c = mmap(NULL, MEMSIZE, PROT_READ|PROT_WRITE,\n \t\t\t       MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);\n \t\tFAIL_IF(c == MAP_FAILED);\n+\n+\t\tfor (j=0; j < (MEMSIZE >> 16); j++)\n+\t\t\tc[j<<16] = 0xf;\n+\n \t\tmunmap(c, MEMSIZE);\n 
\t}","headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3yLY0B1zknz9sNw\n\tfor <patchwork-incoming@ozlabs.org>;\n\tTue, 24 Oct 2017 10:44:26 +1100 (AEDT)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3yLY0B14mqzDqmt\n\tfor <patchwork-incoming@ozlabs.org>;\n\tTue, 24 Oct 2017 10:44:26 +1100 (AEDT)","from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com\n\t[148.163.156.1])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3yLXyT0TfXzDqkJ\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tTue, 24 Oct 2017 10:42:56 +1100 (AEDT)","from pps.filterd (m0098394.ppops.net [127.0.0.1])\n\tby mx0a-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id\n\tv9NNfdlM005328\n\tfor <linuxppc-dev@lists.ozlabs.org>; Mon, 23 Oct 2017 19:42:54 -0400","from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.152])\n\tby mx0a-001b2d01.pphosted.com with ESMTP id 2dst4p0ckn-1\n\t(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)\n\tfor <linuxppc-dev@lists.ozlabs.org>; Mon, 23 Oct 2017 19:42:54 -0400","from localhost\n\tby e34.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use\n\tOnly! Violators will be prosecuted\n\tfor <linuxppc-dev@lists.ozlabs.org> from <linuxram@us.ibm.com>;\n\tMon, 23 Oct 2017 17:42:53 -0600","from b03cxnp08027.gho.boulder.ibm.com (9.17.130.19)\n\tby e34.co.us.ibm.com (192.168.1.134) with IBM ESMTP SMTP Gateway:\n\tAuthorized Use Only! 
Violators will be prosecuted; \n\tMon, 23 Oct 2017 17:42:51 -0600","from b03ledav006.gho.boulder.ibm.com\n\t(b03ledav006.gho.boulder.ibm.com [9.17.130.237])\n\tby b03cxnp08027.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with\n\tESMTP id v9NNgo1G65732618; Mon, 23 Oct 2017 16:42:50 -0700","from b03ledav006.gho.boulder.ibm.com (unknown [127.0.0.1])\n\tby IMSVA (Postfix) with ESMTP id A7FF0C603C;\n\tMon, 23 Oct 2017 17:42:50 -0600 (MDT)","from ram.oc3035372033.ibm.com (unknown [9.85.182.80])\n\tby b03ledav006.gho.boulder.ibm.com (Postfix) with ESMTPS id\n\t1036BC6037; Mon, 23 Oct 2017 17:42:48 -0600 (MDT)"],"Authentication-Results":"ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=us.ibm.com\n\t(client-ip=148.163.156.1; helo=mx0a-001b2d01.pphosted.com;\n\tenvelope-from=linuxram@us.ibm.com; receiver=<UNKNOWN>)","Date":"Mon, 23 Oct 2017 16:42:46 -0700","From":"Ram Pai <linuxram@us.ibm.com>","To":"\"Aneesh Kumar K.V\" <aneesh.kumar@linux.vnet.ibm.com>","Subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","References":"<1504910713-7094-1-git-send-email-linuxram@us.ibm.com>\n\t<1504910713-7094-5-git-send-email-linuxram@us.ibm.com>\n\t<1505376837.12628.192.camel@kernel.crashing.org>\n\t<87y3o28cv7.fsf@linux.vnet.ibm.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<87y3o28cv7.fsf@linux.vnet.ibm.com>","User-Agent":"Mutt/1.5.20 (2009-12-10)","X-TM-AS-GCONF":"00","x-cbid":"17102323-0016-0000-0000-000007B2CB14","X-IBM-SpamModules-Scores":"","X-IBM-SpamModules-Versions":"BY=3.00007941; HX=3.00000241; KW=3.00000007;\n\tPH=3.00000004; SC=3.00000239; SDB=6.00935506; UDB=6.00471330;\n\tIPR=6.00715772; \n\tBA=6.00005656; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009;\n\tZB=6.00000000; \n\tZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017678;\n\tXFM=3.00000015; UTC=2017-10-23 23:42:53","X-IBM-AV-DETECTION":"SAVI=unused REMOTE=unused 
XFE=unused","x-cbparentid":"17102323-0017-0000-0000-00003BF98F1F","Message-Id":"<20171023234246.GA5485@ram.oc3035372033.ibm.com>","X-Proofpoint-Virus-Version":"vendor=fsecure engine=2.50.10432:, ,\n\tdefinitions=2017-10-23_12:, , signatures=0","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound score=0\n\tspamscore=0 suspectscore=0\n\tmalwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam\n\tadjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000\n\tdefinitions=main-1710230333","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.24","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Reply-To":"Ram Pai <linuxram@us.ibm.com>","Cc":"mhocko@kernel.org, paulus@samba.org, ebiederm@xmission.com,\n\tbauerman@linux.vnet.ibm.com, linuxppc-dev@lists.ozlabs.org,\n\tkhandual@linux.vnet.ibm.com","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"}},{"id":1792953,"web_url":"http://patchwork.ozlabs.org/comment/1792953/","msgid":"<2ca10f6f-972f-b7f9-fcce-81becad58c0c@linux.vnet.ibm.com>","date":"2017-10-24T03:37:36","subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","submitter":{"id":664,"url":"http://patchwork.ozlabs.org/api/people/664/","name":"Aneesh Kumar K.V","email":"aneesh.kumar@linux.vnet.ibm.com"},"content":"On 10/24/2017 12:52 AM, Ram Pai 
wrote:\n> On Thu, Sep 14, 2017 at 06:13:57PM +1000, Benjamin Herrenschmidt wrote:\n>> On Fri, 2017-09-08 at 15:44 -0700, Ram Pai wrote:\n>>> The second part of the PTE will hold\n>>> (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.\n>>> NOTE: None of the bits in the secondary PTE were not used\n>>> by 64k-HPTE backed PTE.\n>>\n>> Have you measured the performance impact of this ? The second part of\n>> the PTE being in a different cache line there could be one...\n> \n> hmm..missed responding to this comment.\n> \n> I did a preliminay measurement running mmap bench in the selftest.\n> Ran it multiple times. almost always the numbers were either equal-to\n> or better-than without the patch-series.\n\nmmap bench doesn't do any fault. It is just mmap/munmap in loop.\n\n-aneesh","headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3yLfBz0JSCz9sNV\n\tfor <patchwork-incoming@ozlabs.org>;\n\tTue, 24 Oct 2017 14:39:07 +1100 (AEDT)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3yLfBy6bfYzDqts\n\tfor <patchwork-incoming@ozlabs.org>;\n\tTue, 24 Oct 2017 14:39:06 +1100 (AEDT)","from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com\n\t[148.163.158.5])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3yLf9T73Q5zDqhg\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tTue, 24 Oct 2017 14:37:49 +1100 (AEDT)","from pps.filterd (m0098419.ppops.net [127.0.0.1])\n\tby mx0b-001b2d01.pphosted.com 
(8.16.0.21/8.16.0.21) with SMTP id\n\tv9O3Xt2b050744\n\tfor <linuxppc-dev@lists.ozlabs.org>; Mon, 23 Oct 2017 23:37:46 -0400","from e14.ny.us.ibm.com (e14.ny.us.ibm.com [129.33.205.204])\n\tby mx0b-001b2d01.pphosted.com with ESMTP id 2dspxr6r9b-1\n\t(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)\n\tfor <linuxppc-dev@lists.ozlabs.org>; Mon, 23 Oct 2017 23:37:46 -0400","from localhost\n\tby e14.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use\n\tOnly! Violators will be prosecuted\n\tfor <linuxppc-dev@lists.ozlabs.org> from\n\t<aneesh.kumar@linux.vnet.ibm.com>; Mon, 23 Oct 2017 23:37:45 -0400","from b01cxnp22034.gho.pok.ibm.com (9.57.198.24)\n\tby e14.ny.us.ibm.com (146.89.104.201) with IBM ESMTP SMTP Gateway:\n\tAuthorized Use Only! Violators will be prosecuted; \n\tMon, 23 Oct 2017 23:37:42 -0400","from b01ledav004.gho.pok.ibm.com (b01ledav004.gho.pok.ibm.com\n\t[9.57.199.109])\n\tby b01cxnp22034.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP\n\tid v9O3bg4D51970098; Tue, 24 Oct 2017 03:37:42 GMT","from b01ledav004.gho.pok.ibm.com (unknown [127.0.0.1])\n\tby IMSVA (Postfix) with ESMTP id 1BDF2112054;\n\tMon, 23 Oct 2017 23:37:13 -0400 (EDT)","from [9.85.148.119] (unknown [9.85.148.119])\n\tby b01ledav004.gho.pok.ibm.com (Postfix) with ESMTP id AB13F112040;\n\tMon, 23 Oct 2017 23:37:09 -0400 (EDT)"],"Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com\n\t(client-ip=148.163.158.5; helo=mx0a-001b2d01.pphosted.com;\n\tenvelope-from=aneesh.kumar@linux.vnet.ibm.com; receiver=<UNKNOWN>)","Subject":"Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed\n\tHPTE pages","To":"Ram Pai <linuxram@us.ibm.com>,\n\tBenjamin Herrenschmidt <benh@kernel.crashing.org>","References":"<1504910713-7094-1-git-send-email-linuxram@us.ibm.com>\n\t<1504910713-7094-5-git-send-email-linuxram@us.ibm.com>\n\t<1505376837.12628.192.camel@kernel.crashing.org>\n\t<20171023192236.GA5488@ram.oc3035372033.ibm.com>","From":"\"Aneesh 
Kumar K.V\" <aneesh.kumar@linux.vnet.ibm.com>","Date":"Tue, 24 Oct 2017 09:07:36 +0530","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.3.0","MIME-Version":"1.0","In-Reply-To":"<20171023192236.GA5488@ram.oc3035372033.ibm.com>","Content-Type":"text/plain; charset=utf-8; format=flowed","Content-Language":"en-US","Content-Transfer-Encoding":"7bit","X-TM-AS-GCONF":"00","x-cbid":"17102403-0052-0000-0000-000002762EAD","X-IBM-SpamModules-Scores":"","X-IBM-SpamModules-Versions":"BY=3.00007942; HX=3.00000241; KW=3.00000007;\n\tPH=3.00000004; SC=3.00000239; SDB=6.00935584; UDB=6.00471377;\n\tIPR=6.00715850; \n\tBA=6.00005656; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009;\n\tZB=6.00000000; \n\tZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017680;\n\tXFM=3.00000015; UTC=2017-10-24 03:37:44","X-IBM-AV-DETECTION":"SAVI=unused REMOTE=unused XFE=unused","x-cbparentid":"17102403-0053-0000-0000-00005269B97C","Message-Id":"<2ca10f6f-972f-b7f9-fcce-81becad58c0c@linux.vnet.ibm.com>","X-Proofpoint-Virus-Version":"vendor=fsecure engine=2.50.10432:, ,\n\tdefinitions=2017-10-24_01:, , signatures=0","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound score=0\n\tspamscore=0 suspectscore=0\n\tmalwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam\n\tadjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000\n\tdefinitions=main-1710240050","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.24","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail 
List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Cc":"mhocko@kernel.org, paulus@samba.org, ebiederm@xmission.com,\n\tbauerman@linux.vnet.ibm.com, linuxppc-dev@lists.ozlabs.org,\n\tkhandual@linux.vnet.ibm.com","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"}}]