Patchwork [-V3,11/11] arch/powerpc: Add 64TB support

Submitter Aneesh Kumar K.V
Date July 9, 2012, 1:13 p.m.
Message ID <1341839621-28332-12-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
Permalink /patch/169832/
State Changes Requested
Delegated to: Benjamin Herrenschmidt

Comments

Aneesh Kumar K.V - July 9, 2012, 1:13 p.m.
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

Increase max addressable range to 64TB. This is not tested on
real hardware yet.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/mmu-hash64.h        |    8 ++++----
 arch/powerpc/include/asm/pgtable-ppc64-4k.h  |    2 +-
 arch/powerpc/include/asm/pgtable-ppc64-64k.h |    2 +-
 arch/powerpc/include/asm/processor.h         |    4 ++--
 arch/powerpc/include/asm/sparsemem.h         |    4 ++--
 5 files changed, 10 insertions(+), 10 deletions(-)
Paul Mackerras - July 23, 2012, 12:15 a.m.
On Mon, Jul 09, 2012 at 06:43:41PM +0530, Aneesh Kumar K.V wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> 
> Increase max addressable range to 64TB. This is not tested on
> real hardware yet.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/mmu-hash64.h        |    8 ++++----
>  arch/powerpc/include/asm/pgtable-ppc64-4k.h  |    2 +-
>  arch/powerpc/include/asm/pgtable-ppc64-64k.h |    2 +-
>  arch/powerpc/include/asm/processor.h         |    4 ++--
>  arch/powerpc/include/asm/sparsemem.h         |    4 ++--
>  5 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
> index aa0d560..a227ba7 100644
> --- a/arch/powerpc/include/asm/mmu-hash64.h
> +++ b/arch/powerpc/include/asm/mmu-hash64.h
> @@ -374,16 +374,16 @@ extern void slb_set_size(u16 size);
>   */
>  
>  #define VSID_MULTIPLIER_256M	ASM_CONST(200730139)	/* 28-bit prime */
> -#define VSID_BITS_256M		36
> +#define VSID_BITS_256M		38
>  #define VSID_MODULUS_256M	((1UL<<VSID_BITS_256M)-1)

With these settings, the multiplication in ASM_VSID_SCRAMBLE could
overflow, leading to incorrect results (which would cause occasional
corruption of user processes under heavy load).  You will need to
reduce the multiplier to be less than 2^26, and it will need to be
co-prime with 2^38 - 1.  (Probably, the same value as we use in the 1T
case would be OK.)

Paul.
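
The overflow concern above can be checked with a little arithmetic. The following is a minimal sketch, assuming the scramble product must fit in an unsigned 64-bit register; the constants come from the patch, while bits_needed(), product_fits_u64() and check_multipliers() are hypothetical helpers written for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the patch under review. */
#define VSID_MULTIPLIER_256M	200730139ULL	/* 28-bit prime */
#define VSID_MULTIPLIER_1T	12538073ULL	/* 24-bit prime */

/* Hypothetical helper: number of bits needed to represent v. */
unsigned int bits_needed(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		n++;
		v >>= 1;
	}
	return n;
}

/*
 * Hypothetical helper: nonzero if multiplier * protovsid fits in 64 bits
 * for every protovsid below 2^vsid_bits (vsid_bits must be in 1..63).
 */
int product_fits_u64(uint64_t multiplier, unsigned int vsid_bits)
{
	uint64_t max_protovsid = (1ULL << vsid_bits) - 1;

	return multiplier <= UINT64_MAX / max_protovsid;
}

/* Illustrative self-check of the three cases discussed in the review. */
void check_multipliers(void)
{
	/* 28-bit multiplier with 36-bit proto-VSIDs: product fits. */
	assert(product_fits_u64(VSID_MULTIPLIER_256M, 36));
	/* Same multiplier with 38-bit proto-VSIDs: product can overflow. */
	assert(!product_fits_u64(VSID_MULTIPLIER_256M, 38));
	/* 24-bit multiplier with 38-bit proto-VSIDs: product fits. */
	assert(product_fits_u64(VSID_MULTIPLIER_1T, 38));
}
```

Calling check_multipliers() from a test program exercises all three cases. The bound is simply 28 + 38 = 66 bits for the current multiplier against 24 + 38 = 62 bits for the 1T multiplier.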
Aneesh Kumar K.V - July 23, 2012, 8:49 a.m.
Paul Mackerras <paulus@samba.org> writes:

> On Mon, Jul 09, 2012 at 06:43:41PM +0530, Aneesh Kumar K.V wrote:
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>> 
>> Increase max addressable range to 64TB. This is not tested on
>> real hardware yet.
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/include/asm/mmu-hash64.h        |    8 ++++----
>>  arch/powerpc/include/asm/pgtable-ppc64-4k.h  |    2 +-
>>  arch/powerpc/include/asm/pgtable-ppc64-64k.h |    2 +-
>>  arch/powerpc/include/asm/processor.h         |    4 ++--
>>  arch/powerpc/include/asm/sparsemem.h         |    4 ++--
>>  5 files changed, 10 insertions(+), 10 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
>> index aa0d560..a227ba7 100644
>> --- a/arch/powerpc/include/asm/mmu-hash64.h
>> +++ b/arch/powerpc/include/asm/mmu-hash64.h
>> @@ -374,16 +374,16 @@ extern void slb_set_size(u16 size);
>>   */
>>  
>>  #define VSID_MULTIPLIER_256M	ASM_CONST(200730139)	/* 28-bit prime */
>> -#define VSID_BITS_256M		36
>> +#define VSID_BITS_256M		38
>>  #define VSID_MODULUS_256M	((1UL<<VSID_BITS_256M)-1)
>
> With these settings, the multiplication in ASM_VSID_SCRAMBLE could
> overflow, leading to incorrect results (which would cause occasional
> corruption of user processes under heavy load).  You will need to
> reduce the multiplier to be less than 2^26, and it will need to be
> co-prime with 2^38 - 1.  (Probably, the same value as we use in the 1T
> case would be OK.)

I ended up using the same value as VSID_MULTIPLIER_1T (12538073). It is
prime and does not divide 2^38 - 1, so the two are co-prime:

(gdb) p/d 1ull << 38
$1 = 274877906944
(gdb) p/d 274877906943/12538073
$2 = 21923
(gdb) p/d 12538073ll*21923
$5 = 274872174379
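
The gdb session above can also be reproduced in C. This is a small sketch, not code from the patch: gcd_u64() and check_coprime() are hypothetical names, and the Euclidean algorithm is used here as a direct co-primality test instead of relying on the multiplier being prime.

```c
#include <assert.h>
#include <stdint.h>

/* Euclidean algorithm on unsigned 64-bit values. */
uint64_t gcd_u64(uint64_t a, uint64_t b)
{
	while (b) {
		uint64_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* Repeats the gdb arithmetic, then checks co-primality directly. */
void check_coprime(void)
{
	const uint64_t multiplier = 12538073ULL;	/* VSID_MULTIPLIER_1T */
	const uint64_t modulus = (1ULL << 38) - 1;	/* 2^38 - 1 */

	assert(modulus == 274877906943ULL);
	assert(modulus / multiplier == 21923);
	assert(multiplier * 21923 == 274872174379ULL);
	/* The product falls short of the modulus, so the division is not exact. */
	assert(modulus % multiplier != 0);
	assert(gcd_u64(multiplier, modulus) == 1);
}
```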
Paul Mackerras - July 23, 2012, 9:39 a.m.
On Mon, Jul 09, 2012 at 06:43:41PM +0530, Aneesh Kumar K.V wrote:

> -#define USER_ESID_BITS		16
> -#define USER_ESID_BITS_1T	4
> +#define USER_ESID_BITS		18
> +#define USER_ESID_BITS_1T	6

You also need to change the proto-VSID generation for kernel addresses
when you do this.  If you don't you'll end up with some user processes
using the same VSIDs as we use for the kernel addresses, meaning that
those processes won't run very well...

Paul.
Aneesh Kumar K.V - July 23, 2012, 10:22 a.m.
Paul Mackerras <paulus@samba.org> writes:

> On Mon, Jul 09, 2012 at 06:43:41PM +0530, Aneesh Kumar K.V wrote:
>
>> -#define USER_ESID_BITS		16
>> -#define USER_ESID_BITS_1T	4
>> +#define USER_ESID_BITS		18
>> +#define USER_ESID_BITS_1T	6
>
> You also need to change the proto-VSID generation for kernel addresses
> when you do this.  If you don't you'll end up with some user processes
> using the same VSIDs as we use for the kernel addresses, meaning that
> those processes won't run very well...
>

Can you explain this in more detail? Right now we generate the VSID as follows:

vsid_scramble(ea >> SID_SHIFT, 256M) for kernel addresses, and

vsid_scramble((context << USER_ESID_BITS) | (ea >> SID_SHIFT), 256M)
for user addresses.

What changes are you suggesting?

-aneesh
Paul Mackerras - July 23, 2012, 11:06 a.m.
On Mon, Jul 23, 2012 at 03:52:05PM +0530, Aneesh Kumar K.V wrote:
> Paul Mackerras <paulus@samba.org> writes:
> 
> > On Mon, Jul 09, 2012 at 06:43:41PM +0530, Aneesh Kumar K.V wrote:
> >
> >> -#define USER_ESID_BITS		16
> >> -#define USER_ESID_BITS_1T	4
> >> +#define USER_ESID_BITS		18
> >> +#define USER_ESID_BITS_1T	6
> >
> > You also need to change the proto-VSID generation for kernel addresses
> > when you do this.  If you don't you'll end up with some user processes
> > using the same VSIDs as we use for the kernel addresses, meaning that
> > those processes won't run very well...
> >
> 
> Can you explain this in more detail? Right now we generate the VSID as follows:
> 
> vsid_scramble(ea >> SID_SHIFT, 256M) for kernel addresses, and
> 
> vsid_scramble((context << USER_ESID_BITS) | (ea >> SID_SHIFT), 256M)
> for user addresses.
> 
> What changes are you suggesting?

Think about it.  With the current values of USER_ESID_BITS and
CONTEXT_BITS, and the addresses we use for kernel mappings, there are
no values of context, user_ea and kernel_ea for which

kernel_ea >> SID_SHIFT == (context << USER_ESID_BITS) | (user_ea >> SID_SHIFT)

If you increase USER_ESID_BITS, then there will be some context values
for which that equation becomes true.  For example, if you increase
USER_ESID_BITS to 18, then context 0x30000 will generate the same
proto-VSIDs as the kernel linear mapping.  Since we can hand out
contexts up to 0x7ffff (with CONTEXT_BITS = 19), there is a collision.

In other words, the proto-VSID space (the space of values that are
input to vsid_scramble) is currently divided into two mutually
exclusive regions: from 0 to 2^35 - 1 for user processes, and from
2^35 to 2^36 - 1 for kernel addresses.  You are wanting to expand the
amount of proto-VSID space that user processes can use, but you need
either to move the kernel portion of the space, or to make sure that
the context allocator doesn't hand out context values that would
collide with the kernel portion of the space (or both).

Paul.
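
The collision described above can be made concrete with a short sketch. It assumes 256M segments (SID_SHIFT = 28) and that the kernel linear mapping starts at 0xc000000000000000; the helper names are hypothetical and only mirror the two expressions quoted in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define SID_SHIFT		28	/* 256M segments */
#define CONTEXT_BITS		19
/* Assumed start of the kernel linear mapping on ppc64. */
#define KERNEL_LINEAR_EA	0xc000000000000000ULL

/* Proto-VSID for a kernel address: ea >> SID_SHIFT. */
uint64_t kernel_proto_vsid(uint64_t ea)
{
	return ea >> SID_SHIFT;
}

/* Proto-VSID for a user address in a given context. */
uint64_t user_proto_vsid(uint64_t context, uint64_t ea,
			 unsigned int user_esid_bits)
{
	return (context << user_esid_bits) | (ea >> SID_SHIFT);
}

/* Largest proto-VSID any user context can produce. */
uint64_t max_user_proto_vsid(unsigned int user_esid_bits)
{
	uint64_t max_context = (1ULL << CONTEXT_BITS) - 1;
	uint64_t max_esid = (1ULL << user_esid_bits) - 1;

	return (max_context << user_esid_bits) | max_esid;
}

/* Illustrative check of the old (16) and new (18) USER_ESID_BITS values. */
void check_proto_vsid_overlap(void)
{
	uint64_t kernel = kernel_proto_vsid(KERNEL_LINEAR_EA);

	/* USER_ESID_BITS = 16: user space ends at 2^35 - 1, below the kernel. */
	assert(max_user_proto_vsid(16) == (1ULL << 35) - 1);
	assert(max_user_proto_vsid(16) < kernel);

	/* USER_ESID_BITS = 18: context 0x30000 lands on the linear mapping. */
	assert(user_proto_vsid(0x30000, 0, 18) == kernel);
	assert(max_user_proto_vsid(18) >= kernel);
}
```

Both sides of the colliding case evaluate to 3 * 2^34 (0xc00000000), which is why either the kernel region or the context allocator has to change.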

Patch

diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index aa0d560..a227ba7 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -374,16 +374,16 @@  extern void slb_set_size(u16 size);
  */
 
 #define VSID_MULTIPLIER_256M	ASM_CONST(200730139)	/* 28-bit prime */
-#define VSID_BITS_256M		36
+#define VSID_BITS_256M		38
 #define VSID_MODULUS_256M	((1UL<<VSID_BITS_256M)-1)
 
 #define VSID_MULTIPLIER_1T	ASM_CONST(12538073)	/* 24-bit prime */
-#define VSID_BITS_1T		24
+#define VSID_BITS_1T		26
 #define VSID_MODULUS_1T		((1UL<<VSID_BITS_1T)-1)
 
 #define CONTEXT_BITS		19
-#define USER_ESID_BITS		16
-#define USER_ESID_BITS_1T	4
+#define USER_ESID_BITS		18
+#define USER_ESID_BITS_1T	6
 
 #define USER_VSID_RANGE	(1UL << (USER_ESID_BITS + SID_SHIFT))
 
diff --git a/arch/powerpc/include/asm/pgtable-ppc64-4k.h b/arch/powerpc/include/asm/pgtable-ppc64-4k.h
index 6eefdcf..b3eccf2 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64-4k.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64-4k.h
@@ -7,7 +7,7 @@ 
  */
 #define PTE_INDEX_SIZE  9
 #define PMD_INDEX_SIZE  7
-#define PUD_INDEX_SIZE  7
+#define PUD_INDEX_SIZE  9
 #define PGD_INDEX_SIZE  9
 
 #ifndef __ASSEMBLY__
diff --git a/arch/powerpc/include/asm/pgtable-ppc64-64k.h b/arch/powerpc/include/asm/pgtable-ppc64-64k.h
index 90533dd..be4e287 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64-64k.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64-64k.h
@@ -7,7 +7,7 @@ 
 #define PTE_INDEX_SIZE  12
 #define PMD_INDEX_SIZE  12
 #define PUD_INDEX_SIZE	0
-#define PGD_INDEX_SIZE  4
+#define PGD_INDEX_SIZE  6
 
 #ifndef __ASSEMBLY__
 #define PTE_TABLE_SIZE	(sizeof(real_pte_t) << PTE_INDEX_SIZE)
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 413a5ea..ac3861b 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -97,8 +97,8 @@  extern struct task_struct *last_task_used_spe;
 #endif
 
 #ifdef CONFIG_PPC64
-/* 64-bit user address space is 44-bits (16TB user VM) */
-#define TASK_SIZE_USER64 (0x0000100000000000UL)
+/* 64-bit user address space is 46-bits (64TB user VM) */
+#define TASK_SIZE_USER64 (0x0000400000000000UL)
 
 /* 
  * 32-bit user address space is 4GB - 1 page 
diff --git a/arch/powerpc/include/asm/sparsemem.h b/arch/powerpc/include/asm/sparsemem.h
index 0c5fa31..f6fc0ee 100644
--- a/arch/powerpc/include/asm/sparsemem.h
+++ b/arch/powerpc/include/asm/sparsemem.h
@@ -10,8 +10,8 @@ 
  */
 #define SECTION_SIZE_BITS       24
 
-#define MAX_PHYSADDR_BITS       44
-#define MAX_PHYSMEM_BITS        44
+#define MAX_PHYSADDR_BITS       46
+#define MAX_PHYSMEM_BITS        46
 
 #endif /* CONFIG_SPARSEMEM */
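
As a sanity check on the page-table and TASK_SIZE_USER64 changes in the patch above, the index sizes can be summed per page size. This is a sketch under the assumption that PAGE_SHIFT is 12 for the 4K configuration and 16 for the 64K configuration; va_bits() and check_va_bits() are hypothetical helpers, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Address bits covered by a four-level table with the given index sizes. */
unsigned int va_bits(unsigned int page_shift, unsigned int pte,
		     unsigned int pmd, unsigned int pud, unsigned int pgd)
{
	return page_shift + pte + pmd + pud + pgd;
}

/* Compares the old and new index sizes from the patch with TASK_SIZE_USER64. */
void check_va_bits(void)
{
	/* 4K pages: PUD_INDEX_SIZE goes from 7 to 9. */
	assert(va_bits(12, 9, 7, 7, 9) == 44);
	assert(va_bits(12, 9, 7, 9, 9) == 46);

	/* 64K pages: PGD_INDEX_SIZE goes from 4 to 6. */
	assert(va_bits(16, 12, 12, 0, 4) == 44);
	assert(va_bits(16, 12, 12, 0, 6) == 46);

	/* TASK_SIZE_USER64 before and after: 16TB = 2^44, 64TB = 2^46. */
	assert(0x0000100000000000ULL == 1ULL << 44);
	assert(0x0000400000000000ULL == 1ULL << 46);
}
```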