Patchwork [-V2,07/26] powerpc: Add size argument to pgtable_cache_add

Submitter Aneesh Kumar K.V
Date March 6, 2013, 6:10 a.m.
Message ID <1362550227-575-8-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
Permalink /patch/225348/
State Superseded, archived

Comments

Aneesh Kumar K.V - March 6, 2013, 6:10 a.m.
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

We will use this later with the THP changes to request a pmd table of double the
size. The THP code allocates a PTE page along with the large page request and
deposits it for later use, to ensure that we won't have any failures when we
split huge pages back to regular pages.

On powerpc we want to use the deposited PTE page for storing the hash PTE slot
and secondary-bit information for the HPTEs. Hence we save them in the second
half of the pmd table.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pgtable-ppc64.h |    7 ++++++-
 arch/powerpc/mm/init_64.c                |    4 ++--
 2 files changed, 8 insertions(+), 3 deletions(-)
Paul Mackerras - March 13, 2013, 2:47 a.m.
On Wed, Mar 06, 2013 at 11:40:08AM +0530, Aneesh Kumar K.V wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> 
> We will use this later with the THP changes to request a pmd table of double the
> size. The THP code allocates a PTE page along with the large page request and
> deposits it for later use, to ensure that we won't have any failures when we
> split huge pages back to regular pages.
> 
> On powerpc we want to use the deposited PTE page for storing the hash PTE slot
> and secondary-bit information for the HPTEs. Hence we save them in the second
> half of the pmd table.

That much is fine, but...

> -void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
> +extern void __pgtable_cache_add(unsigned shift, unsigned long table_size,
> +				void (*ctor)(void *));
>  void pgtable_cache_init(void);
> +static inline void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
> +{
> +	return __pgtable_cache_add(shift, sizeof(void *) << shift, ctor);
> +}

NAK to this (and in fact the rest of this patch).

If you need twice the size, use the next slot along in pgtable_cache,
which will give you the size you need, rather than breaking the
relationship between slot index and allocation size.

Paul.
Aneesh Kumar K.V - March 13, 2013, 9:25 a.m.
Paul Mackerras <paulus@samba.org> writes:

> On Wed, Mar 06, 2013 at 11:40:08AM +0530, Aneesh Kumar K.V wrote:
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>> 
>> We will use this later with the THP changes to request a pmd table of double the
>> size. The THP code allocates a PTE page along with the large page request and
>> deposits it for later use, to ensure that we won't have any failures when we
>> split huge pages back to regular pages.
>> 
>> On powerpc we want to use the deposited PTE page for storing the hash PTE slot
>> and secondary-bit information for the HPTEs. Hence we save them in the second
>> half of the pmd table.
>
> That much is fine, but...
>
>> -void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
>> +extern void __pgtable_cache_add(unsigned shift, unsigned long table_size,
>> +				void (*ctor)(void *));
>>  void pgtable_cache_init(void);
>> +static inline void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
>> +{
>> +	return __pgtable_cache_add(shift, sizeof(void *) << shift, ctor);
>> +}
>
> NAK to this (and in fact the rest of this patch).
>
> If you need twice the size, use the next slot along in pgtable_cache,
> which will give you the size you need, rather than breaking the
> relationship between slot index and allocation size.

That would mean we will have to do 

 static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
-	return kmem_cache_alloc(PGT_CACHE(PMD_INDEX_SIZE),
+	return kmem_cache_alloc(PGT_CACHE(PMD_INDEX_SIZE + 1),
 				GFP_KERNEL|__GFP_REPEAT);
 }

is that ok ?

-aneesh
Paul Mackerras - March 13, 2013, 11:34 a.m.
On Wed, Mar 13, 2013 at 02:55:17PM +0530, Aneesh Kumar K.V wrote:
> Paul Mackerras <paulus@samba.org> writes:
> > NAK to this (and in fact the rest of this patch).
> >
> > If you need twice the size, use the next slot along in pgtable_cache,
> > which will give you the size you need, rather than breaking the
> > relationship between slot index and allocation size.
> 
> That would mean we will have to do 
> 
>  static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
>  {
> -	return kmem_cache_alloc(PGT_CACHE(PMD_INDEX_SIZE),
> +	return kmem_cache_alloc(PGT_CACHE(PMD_INDEX_SIZE + 1),
>  				GFP_KERNEL|__GFP_REPEAT);
>  }
> 
> is that ok ?

I would define a symbol such as PMD_CACHE_INDEX and arrange for that
to be either PMD_INDEX_SIZE or PMD_INDEX_SIZE + 1, as needed, and then
use PGT_CACHE(PMD_CACHE_INDEX) everywhere instead of
PGT_CACHE(PMD_INDEX_SIZE).  If you want, you could also do
PUD_CACHE_INDEX and PGD_CACHE_INDEX for consistency, but it's not
totally necessary.

Paul.

Patch

diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index 0182c20..d51d893 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -338,8 +338,13 @@  static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 #define pgoff_to_pte(off)	((pte_t) {((off) << PTE_RPN_SHIFT)|_PAGE_FILE})
 #define PTE_FILE_MAX_BITS	(BITS_PER_LONG - PTE_RPN_SHIFT)
 
-void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
+extern void __pgtable_cache_add(unsigned shift, unsigned long table_size,
+				void (*ctor)(void *));
 void pgtable_cache_init(void);
+static inline void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
+{
+	return __pgtable_cache_add(shift, sizeof(void *) << shift, ctor);
+}
 
 /*
  * find_linux_pte returns the address of a linux pte for a given
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 95a4529..d6df419 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -100,10 +100,10 @@  struct kmem_cache *pgtable_cache[MAX_PGTABLE_INDEX_SIZE];
  * everything else.  Caches created by this function are used for all
  * the higher level pagetables, and for hugepage pagetables.
  */
-void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
+void __pgtable_cache_add(unsigned int shift, unsigned long table_size,
+			 void (*ctor)(void *))
 {
 	char *name;
-	unsigned long table_size = sizeof(void *) << shift;
 	unsigned long align = table_size;
 
 	/* When batching pgtable pointers for RCU freeing, we store