[1/2] powerpc: add 16K/64K pages support for the 44x PPC32 architectures.

Message ID: 1224123753-20907-2-git-send-email-yanok@emcraft.com
State: Superseded, archived
Delegated to: Josh Boyer
Commit Message

Ilya Yanok Oct. 16, 2008, 2:22 a.m. UTC
This patch adds support for page sizes bigger than 4K (16K/64K) on
PPC 44x.

Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Vladimir Panfilov <pvr@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
---
 arch/powerpc/Kconfig                   |   26 ++++++++++++++++++++------
 arch/powerpc/include/asm/highmem.h     |    8 +++++++-
 arch/powerpc/include/asm/mmu-44x.h     |   18 ++++++++++++++++++
 arch/powerpc/include/asm/page.h        |   13 ++++++++-----
 arch/powerpc/include/asm/pgtable.h     |    3 +++
 arch/powerpc/kernel/asm-offsets.c      |    4 ++++
 arch/powerpc/kernel/head_44x.S         |   22 +++++++++++++---------
 arch/powerpc/kernel/misc_32.S          |   12 ++++++------
 arch/powerpc/mm/pgtable_32.c           |    9 ++-------
 arch/powerpc/platforms/Kconfig.cputype |    2 +-
 10 files changed, 82 insertions(+), 35 deletions(-)

Comments

prodyut hazarika Oct. 17, 2008, 3:54 p.m. UTC | #1
On Wed, Oct 15, 2008 at 7:22 PM, Ilya Yanok <yanok@emcraft.com> wrote:
> This patch adds support for page sizes bigger than 4K (16K/64K) on
> PPC 44x.
>

This patch looks good to me. Seems that all the review comments have
been incorporated.

Josh, it would be great if this patch is pulled into the mainline
kernel. I have seen significant performance improvement with RAID0/5
by using 64K pages.
Josh Boyer Oct. 18, 2008, 12:58 p.m. UTC | #2
On Fri, 17 Oct 2008 08:54:52 -0700
"prodyut hazarika" <prodyuth@gmail.com> wrote:

> On Wed, Oct 15, 2008 at 7:22 PM, Ilya Yanok <yanok@emcraft.com> wrote:
> > This patch adds support for page sizes bigger than 4K (16K/64K) on
> > PPC 44x.
> >
> 
> This patch looks good to me. Seems that all the review comments have
> been incorporated.
> 
> Josh, it would be great if this patch is pulled into the mainline
> kernel. I have seen significant performance improvement with RAID0/5
> by using 64K pages.

It helps if you CC the person you're writing to :).

Anyway, I looked over it briefly and agree it looks pretty good.  A bit
late for 2.6.28, but I'll do a more thorough review and get it in for
2.6.29.

josh
prodyut hazarika Oct. 18, 2008, 8:36 p.m. UTC | #3
> It helps if you CC the person you're writing to :).
Thanks, Josh, for pointing this out :-) I will be careful in the future.

> Anyway, I looked over it briefly and agree it looks pretty good.  A bit
> late for 2.6.28, but I'll do a more thorough review and get it in for
> 2.6.29.
>
Great. I look forward to seeing this in the mainline kernel.
ehrhardt@linux.vnet.ibm.com Oct. 22, 2008, 2:28 p.m. UTC | #4
Hi Ilya,
I just tried your patch on my 440 board because it would help us in our
environment.
Unfortunately, I ran into a bug on early boot (mark_bootmem).

A log can be found in this mail; this is the bug when running with 64k
page size.
I tried this with and without your 2/2 256k patch and also with the page
size configured to 16k; the error is the same in all cases.

I used an earlier version of your patch in the past and it worked fine.
Applying this old patch now causes the same problem.
Therefore I expect that some other code has changed that breaks
with page size != 4k.

I have not checked this in detail yet, but I would be grateful for any
hints on how to fix this.

=> bootm
## Booting kernel from Legacy Image at 04000000 ...
   Image Name:   Linux-2.6.27-dirty
   Image Type:   PowerPC Linux Kernel Image (gzip compressed)
   Data Size:    1512203 Bytes =  1.4 MB
   Load Address: 00400000
   Entry Point:  00400458
   Verifying Checksum ... OK
   Uncompressing Kernel Image ... OK
CPU clock-frequency <- 0x27bc86a4 (667MHz)
CPU timebase-frequency <- 0x27bc86a4 (667MHz)
/plb: clock-frequency <- 9ef21a9 (167MHz)
/plb/opb: clock-frequency <- 4f790d4 (83MHz)
/plb/opb/ebc: clock-frequency <- 34fb5e3 (56MHz)
/plb/opb/serial@ef600300: clock-frequency <- a8c000 (11MHz)
/plb/opb/serial@ef600400: clock-frequency <- a8c000 (11MHz)
/plb/opb/serial@ef600500: clock-frequency <- 42ecac (4MHz)
/plb/opb/serial@ef600600: clock-frequency <- 42ecac (4MHz)
Memory <- <0x0 0x0 0xffff000> (255MB)
ethernet0: local-mac-address <- 00:10:ec:00:e2:3e
ethernet1: local-mac-address <- 00:10:ec:80:e2:3e

zImage starting: loaded at 0x00400000 (sp: 0x0fe3c820)
Allocating 0x3c54dc bytes for kernel ...
gunzipping (0x00000000 <- 0x0040e000:0x007a2428)...done 0x380a90 bytes

Linux/PowerPC load: console=ttyS0,115200 ip=dhcp nfsroot=192.168.1.2:/home/paelzer/ubuntu_ppc.8.04 root=/dev/nfs rw
Finalizing device tree... flat tree at 0x40bed8
Using PowerPC 44x Platform machine description
Linux version 2.6.27-dirty (paelzer@HelionPrime) (gcc version 4.2.3) #5 Wed Oct 22 15:15:40 CEST 2008
console [udbg0] enabled
------------[ cut here ]------------
Kernel BUG at c02be6cc [verbose debug info unavailable]
Oops: Exception in kernel mode, sig: 5 [#1]
PowerPC 44x Platform
NIP: c02be6cc LR: c02ba4e4 CTR: 00000000
REGS: c0351eb0 TRAP: 0700   Not tainted  (2.6.27-dirty)
MSR: 00021000 <ME>  CR: 22004022  XER: 0000005f
TASK = c03204a8[0] 'swapper' THREAD: c0350000
GPR00: c02d0a1c c0351f60 c03204a8 00000fff 00001000 00000001 00000000 00000000
GPR08: e0000000 00000000 ffffffff c02d0a14 22000024 00000000 0ffa6800 0ffbf000
GPR16: c02ed838 bfe8f45c 00000000 00000000 0ffa7500 0fe3cb20 00000001 c02d0a1c
GPR24: 00000000 00000001 00001000 00000fff c0390000 00000fff c039d1d0 c02d0a08
NIP [c02be6cc] mark_bootmem+0xe0/0x124
LR [c02ba4e4] do_init_bootmem+0x134/0x168
Call Trace:
[c0351f60] [c02be6a4] mark_bootmem+0xb8/0x124 (unreliable)
[c0351f90] [c02ba4e4] do_init_bootmem+0x134/0x168
[c0351fb0] [c02b8e00] setup_arch+0x13c/0x1b8
[c0351fc0] [c02b066c] start_kernel+0x94/0x2ac
[c0351ff0] [c00001e8] skpinv+0x190/0x1cc
Instruction dump:
7f07c378 4bfffe15 7c7e1b78 4192000c 2f830000 409e0024 7f9ae000 419e0050
817f0014 83bf0004 3bebffec 4bffff68 <0fe00000> 48000000 7f63db78 7fa4eb78
---[ end trace 31fd0ba7d8756001 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
Rebooting in 180 seconds..


Ilya Yanok wrote:
> This patch adds support for page sizes bigger than 4K (16K/64K) on
> PPC 44x.
>
> Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
> Signed-off-by: Vladimir Panfilov <pvr@emcraft.com>
> Signed-off-by: Ilya Yanok <yanok@emcraft.com>
> ---
>  arch/powerpc/Kconfig                   |   26 ++++++++++++++++++++------
>  arch/powerpc/include/asm/highmem.h     |    8 +++++++-
>  arch/powerpc/include/asm/mmu-44x.h     |   18 ++++++++++++++++++
>  arch/powerpc/include/asm/page.h        |   13 ++++++++-----
>  arch/powerpc/include/asm/pgtable.h     |    3 +++
>  arch/powerpc/kernel/asm-offsets.c      |    4 ++++
>  arch/powerpc/kernel/head_44x.S         |   22 +++++++++++++---------
>  arch/powerpc/kernel/misc_32.S          |   12 ++++++------
>  arch/powerpc/mm/pgtable_32.c           |    9 ++-------
>  arch/powerpc/platforms/Kconfig.cputype |    2 +-
>  10 files changed, 82 insertions(+), 35 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 587da5e..9627cfd 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -402,16 +402,30 @@ config PPC_HAS_HASH_64K
>  	depends on PPC64
>  	default n
>
> -config PPC_64K_PAGES
> -	bool "64k page size"
> -	depends on PPC64
> -	select PPC_HAS_HASH_64K
> +choice
> +	prompt "Page size"
> +	default PPC_4K_PAGES
>  	help
> -	  This option changes the kernel logical page size to 64k. On machines
> +	  The PAGE_SIZE definition. Increasing the page size may
> +	  improve the system performance in some dedicated cases like software
> +	  RAID with accelerated calculations. In PPC64 case on machines
>  	  without processor support for 64k pages, the kernel will simulate
>  	  them by loading each individual 4k page on demand transparently,
>  	  while on hardware with such support, it will be used to map
>  	  normal application pages.
> +	  If unsure, set it to 4 KB.
> +
> +config PPC_4K_PAGES
> +	bool "4k page size"
> +
> +config PPC_16K_PAGES
> +	bool "16k page size" if 44x
> +
> +config PPC_64K_PAGES
> +	bool "64k page size" if 44x || PPC64
> +	select PPC_HAS_HASH_64K if PPC64
> +
> +endchoice
>
>  config FORCE_MAX_ZONEORDER
>  	int "Maximum zone order"
> @@ -435,7 +449,7 @@ config FORCE_MAX_ZONEORDER
>
>  config PPC_SUBPAGE_PROT
>  	bool "Support setting protections for 4k subpages"
> -	depends on PPC_64K_PAGES
> +	depends on PPC64 && PPC_64K_PAGES
>  	help
>  	  This option adds support for a system call to allow user programs
>  	  to set access permissions (read/write, readonly, or no access)
> diff --git a/arch/powerpc/include/asm/highmem.h b/arch/powerpc/include/asm/highmem.h
> index 5d99b64..dc1132c 100644
> --- a/arch/powerpc/include/asm/highmem.h
> +++ b/arch/powerpc/include/asm/highmem.h
> @@ -38,9 +38,15 @@ extern pte_t *pkmap_page_table;
>   * easily, subsequent pte tables have to be allocated in one physical
>   * chunk of RAM.
>   */
> +#if defined(CONFIG_PPC_64K_PAGES) && !defined(CONFIG_PPC64)
> +#define PKMAP_ORDER	(27 - PAGE_SHIFT)
> +#define LAST_PKMAP	(1 << PKMAP_ORDER)
> +#define PKMAP_BASE	(FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1))
> +#else
>  #define LAST_PKMAP 	(1 << PTE_SHIFT)
> -#define LAST_PKMAP_MASK (LAST_PKMAP-1)
>  #define PKMAP_BASE	((FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1)) & PMD_MASK)
> +#endif
> +#define LAST_PKMAP_MASK	(LAST_PKMAP-1)
>  #define PKMAP_NR(virt)  ((virt-PKMAP_BASE) >> PAGE_SHIFT)
>  #define PKMAP_ADDR(nr)  (PKMAP_BASE + ((nr) << PAGE_SHIFT))
>
> diff --git a/arch/powerpc/include/asm/mmu-44x.h b/arch/powerpc/include/asm/mmu-44x.h
> index a825524..2ca18e8 100644
> --- a/arch/powerpc/include/asm/mmu-44x.h
> +++ b/arch/powerpc/include/asm/mmu-44x.h
> @@ -4,6 +4,8 @@
>   * PPC440 support
>   */
>
> +#include <asm/page.h>
> +
>  #define PPC44x_MMUCR_TID	0x000000ff
>  #define PPC44x_MMUCR_STS	0x00010000
>
> @@ -73,4 +75,20 @@ typedef struct {
>  /* Size of the TLBs used for pinning in lowmem */
>  #define PPC_PIN_SIZE	(1 << 28)	/* 256M */
>
> +#if (PAGE_SHIFT == 12)
> +#define PPC44x_TLBE_SIZE	PPC44x_TLB_4K
> +#elif (PAGE_SHIFT == 14)
> +#define PPC44x_TLBE_SIZE	PPC44x_TLB_16K
> +#elif (PAGE_SHIFT == 16)
> +#define PPC44x_TLBE_SIZE	PPC44x_TLB_64K
> +#else
> +#error "Unsupported PAGE_SIZE"
> +#endif
> +
> +#define PPC44x_PGD_OFF_SHIFT	(32 - PMD_SHIFT + 2)
> +#define PPC44x_PGD_OFF_MASK	(PMD_SHIFT - 2)
> +#define PPC44x_PTE_ADD_SHIFT	(32 - PMD_SHIFT + PTE_SHIFT + 3)
> +#define PPC44x_PTE_ADD_MASK	(32 - 3 - PTE_SHIFT)
> +#define PPC44x_RPN_MASK		(31 - PAGE_SHIFT)
> +
>  #endif /* _ASM_POWERPC_MMU_44X_H_ */
> diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
> index e088545..537d5b1 100644
> --- a/arch/powerpc/include/asm/page.h
> +++ b/arch/powerpc/include/asm/page.h
> @@ -15,12 +15,15 @@
>  #include <asm/types.h>
>
>  /*
> - * On PPC32 page size is 4K. For PPC64 we support either 4K or 64K software
> + * On regular PPC32 page size is 4K (but we support 4K/16K/64K pages
> + * on PPC44x). For PPC64 we support either 4K or 64K software
>   * page size. When using 64K pages however, whether we are really supporting
>   * 64K pages in HW or not is irrelevant to those definitions.
>   */
> -#ifdef CONFIG_PPC_64K_PAGES
> +#if defined(CONFIG_PPC_64K_PAGES)
>  #define PAGE_SHIFT		16
> +#elif defined(CONFIG_PPC_16K_PAGES)
> +#define PAGE_SHIFT		14
>  #else
>  #define PAGE_SHIFT		12
>  #endif
> @@ -140,7 +143,7 @@ typedef struct { pte_basic_t pte; } pte_t;
>  /* 64k pages additionally define a bigger "real PTE" type that gathers
>   * the "second half" part of the PTE for pseudo 64k pages
>   */
> -#ifdef CONFIG_PPC_64K_PAGES
> +#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC64)
>  typedef struct { pte_t pte; unsigned long hidx; } real_pte_t;
>  #else
>  typedef struct { pte_t pte; } real_pte_t;
> @@ -180,10 +183,10 @@ typedef pte_basic_t pte_t;
>  #define pte_val(x)	(x)
>  #define __pte(x)	(x)
>
> -#ifdef CONFIG_PPC_64K_PAGES
> +#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC64)
>  typedef struct { pte_t pte; unsigned long hidx; } real_pte_t;
>  #else
> -typedef unsigned long real_pte_t;
> +typedef pte_t real_pte_t;
>  #endif
>
>
> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> index dbb8ca1..0d447fb 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable.h
> @@ -39,6 +39,9 @@ extern void paging_init(void);
>
>  #include <asm-generic/pgtable.h>
>
> +#define PGD_T_LOG2	(__builtin_ffs(sizeof(pgd_t)) - 1)
> +#define PMD_T_LOG2	(__builtin_ffs(sizeof(pmd_t)) - 1)
> +#define PTE_T_LOG2	(__builtin_ffs(sizeof(pte_t)) - 1)
>
>  /*
>   * This gets called at the end of handling a page fault, when
> diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
> index 92768d3..98b8bb6 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -375,6 +375,10 @@ int main(void)
>  	DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
>  	DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
>  #endif
> +#ifdef CONFIG_44x
> +	DEFINE(PMD_SHIFT, PMD_SHIFT);
> +	DEFINE(PTE_SHIFT, PTE_SHIFT);
> +#endif
>
>  	return 0;
>  }
> diff --git a/arch/powerpc/kernel/head_44x.S b/arch/powerpc/kernel/head_44x.S
> index f3a1ea9..6525124 100644
> --- a/arch/powerpc/kernel/head_44x.S
> +++ b/arch/powerpc/kernel/head_44x.S
> @@ -391,12 +391,14 @@ interrupt_base:
>  	rlwimi	r13,r12,10,30,30
>
>  	/* Load the PTE */
> -	rlwinm 	r12, r10, 13, 19, 29	/* Compute pgdir/pmd offset */
> +	/* Compute pgdir/pmd offset */
> +	rlwinm  r12, r10, PPC44x_PGD_OFF_SHIFT, PPC44x_PGD_OFF_MASK, 29
>  	lwzx	r11, r12, r11		/* Get pgd/pmd entry */
>  	rlwinm.	r12, r11, 0, 0, 20	/* Extract pt base address */
>  	beq	2f			/* Bail if no table */
>
> -	rlwimi	r12, r10, 23, 20, 28	/* Compute pte address */
> +	/* Compute pte address */
> +	rlwimi  r12, r10, PPC44x_PTE_ADD_SHIFT, PPC44x_PTE_ADD_MASK, 28
>  	lwz	r11, 0(r12)		/* Get high word of pte entry */
>  	lwz	r12, 4(r12)		/* Get low word of pte entry */
>
> @@ -485,12 +487,14 @@ tlb_44x_patch_hwater_D:
>  	/* Make up the required permissions */
>  	li	r13,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_HWEXEC
>
> -	rlwinm	r12, r10, 13, 19, 29	/* Compute pgdir/pmd offset */
> +	/* Compute pgdir/pmd offset */
> +	rlwinm 	r12, r10, PPC44x_PGD_OFF_SHIFT, PPC44x_PGD_OFF_MASK, 29
>  	lwzx	r11, r12, r11		/* Get pgd/pmd entry */
>  	rlwinm.	r12, r11, 0, 0, 20	/* Extract pt base address */
>  	beq	2f			/* Bail if no table */
>
> -	rlwimi	r12, r10, 23, 20, 28	/* Compute pte address */
> +	/* Compute pte address */
> +	rlwimi	r12, r10, PPC44x_PTE_ADD_SHIFT, PPC44x_PTE_ADD_MASK, 28
>  	lwz	r11, 0(r12)		/* Get high word of pte entry */
>  	lwz	r12, 4(r12)		/* Get low word of pte entry */
>
> @@ -554,15 +558,15 @@ tlb_44x_patch_hwater_I:
>   */
>  finish_tlb_load:
>  	/* Combine RPN & ERPN an write WS 0 */
> -	rlwimi	r11,r12,0,0,19
> +	rlwimi	r11,r12,0,0,PPC44x_RPN_MASK
>  	tlbwe	r11,r13,PPC44x_TLB_XLAT
>
>  	/*
>  	 * Create WS1. This is the faulting address (EPN),
>  	 * page size, and valid flag.
>  	 */
> -	li	r11,PPC44x_TLB_VALID | PPC44x_TLB_4K
> -	rlwimi	r10,r11,0,20,31			/* Insert valid and page size*/
> +	li	r11,PPC44x_TLB_VALID | PPC44x_TLBE_SIZE
> +	rlwimi	r10,r11,0,PPC44x_PTE_ADD_MASK,31/* Insert valid and page size*/
>  	tlbwe	r10,r13,PPC44x_TLB_PAGEID	/* Write PAGEID */
>
>  	/* And WS 2 */
> @@ -634,12 +638,12 @@ _GLOBAL(set_context)
>   * goes at the beginning of the data segment, which is page-aligned.
>   */
>  	.data
> -	.align	12
> +	.align	PAGE_SHIFT
>  	.globl	sdata
>  sdata:
>  	.globl	empty_zero_page
>  empty_zero_page:
> -	.space	4096
> +	.space	PAGE_SIZE
>
>  /*
>   * To support >32-bit physical addresses, we use an 8KB pgdir.
> diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
> index 7a6dfbc..0110fcd 100644
> --- a/arch/powerpc/kernel/misc_32.S
> +++ b/arch/powerpc/kernel/misc_32.S
> @@ -589,8 +589,8 @@ _GLOBAL(__flush_dcache_icache)
>  BEGIN_FTR_SECTION
>  	blr
>  END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> -	rlwinm	r3,r3,0,0,19			/* Get page base address */
> -	li	r4,4096/L1_CACHE_BYTES	/* Number of lines in a page */
> +	rlwinm	r3,r3,0,0,PPC44x_RPN_MASK	/* Get page base address */
> +	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a page */
>  	mtctr	r4
>  	mr	r6,r3
>  0:	dcbst	0,r3				/* Write line to ram */
> @@ -630,8 +630,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
>  	rlwinm	r0,r10,0,28,26			/* clear DR */
>  	mtmsr	r0
>  	isync
> -	rlwinm	r3,r3,0,0,19			/* Get page base address */
> -	li	r4,4096/L1_CACHE_BYTES	/* Number of lines in a page */
> +	rlwinm	r3,r3,0,0,PPC44x_RPN_MASK	/* Get page base address */
> +	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a page */
>  	mtctr	r4
>  	mr	r6,r3
>  0:	dcbst	0,r3				/* Write line to ram */
> @@ -655,7 +655,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
>   * void clear_pages(void *page, int order) ;
>   */
>  _GLOBAL(clear_pages)
> -	li	r0,4096/L1_CACHE_BYTES
> +	li	r0,PAGE_SIZE/L1_CACHE_BYTES
>  	slw	r0,r0,r4
>  	mtctr	r0
>  #ifdef CONFIG_8xx
> @@ -713,7 +713,7 @@ _GLOBAL(copy_page)
>  	dcbt	r5,r4
>  	li	r11,L1_CACHE_BYTES+4
>  #endif /* MAX_COPY_PREFETCH */
> -	li	r0,4096/L1_CACHE_BYTES - MAX_COPY_PREFETCH
> +	li	r0,PAGE_SIZE/L1_CACHE_BYTES - MAX_COPY_PREFETCH
>  	crclr	4*cr0+eq
>  2:
>  	mtctr	r0
> diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
> index 2001abd..4eed001 100644
> --- a/arch/powerpc/mm/pgtable_32.c
> +++ b/arch/powerpc/mm/pgtable_32.c
> @@ -72,12 +72,7 @@ extern unsigned long p_mapped_by_tlbcam(unsigned long pa);
>  #define p_mapped_by_tlbcam(x)	(0UL)
>  #endif /* HAVE_TLBCAM */
>
> -#ifdef CONFIG_PTE_64BIT
> -/* 44x uses an 8kB pgdir because it has 8-byte Linux PTEs. */
> -#define PGDIR_ORDER	1
> -#else
> -#define PGDIR_ORDER	0
> -#endif
> +#define PGDIR_ORDER	max(32 + PGD_T_LOG2 - PGDIR_SHIFT - PAGE_SHIFT, 0)
>
>  pgd_t *pgd_alloc(struct mm_struct *mm)
>  {
> @@ -400,7 +395,7 @@ void kernel_map_pages(struct page *page, int numpages, int enable)
>  #endif /* CONFIG_DEBUG_PAGEALLOC */
>
>  static int fixmaps;
> -unsigned long FIXADDR_TOP = 0xfffff000;
> +unsigned long FIXADDR_TOP = (-PAGE_SIZE);
>  EXPORT_SYMBOL(FIXADDR_TOP);
>
>  void __set_fixmap (enum fixed_addresses idx, phys_addr_t phys, pgprot_t flags)
> diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
> index 7f65127..a1386a4 100644
> --- a/arch/powerpc/platforms/Kconfig.cputype
> +++ b/arch/powerpc/platforms/Kconfig.cputype
> @@ -202,7 +202,7 @@ config PPC_STD_MMU_32
>
>  config PPC_MM_SLICES
>  	bool
> -	default y if HUGETLB_PAGE || PPC_64K_PAGES
> +	default y if HUGETLB_PAGE || (PPC64 && PPC_64K_PAGES)
>  	default n
>
>  config VIRT_CPU_ACCOUNTING
>
ehrhardt@linux.vnet.ibm.com Oct. 22, 2008, 5:54 p.m. UTC | #5
Ilya, here is the snippet you asked for, with CONFIG_DEBUG_BUGVERBOSE
enabled and bootmem_debug set.

## Booting kernel from Legacy Image at 04000000 ...
   Image Name:   Linux-2.6.27-dirty
   Image Type:   PowerPC Linux Kernel Image (gzip compressed)
   Data Size:    1521505 Bytes =  1.5 MB
   Load Address: 00400000
   Entry Point:  00400458
   Verifying Checksum ... OK
   Uncompressing Kernel Image ... OK
CPU clock-frequency <- 0x27bc86a4 (667MHz)
CPU timebase-frequency <- 0x27bc86a4 (667MHz)
/plb: clock-frequency <- 9ef21a9 (167MHz)
/plb/opb: clock-frequency <- 4f790d4 (83MHz)
/plb/opb/ebc: clock-frequency <- 34fb5e3 (56MHz)
/plb/opb/serial@ef600300: clock-frequency <- a8c000 (11MHz)
/plb/opb/serial@ef600400: clock-frequency <- a8c000 (11MHz)
/plb/opb/serial@ef600500: clock-frequency <- 42ecac (4MHz)
/plb/opb/serial@ef600600: clock-frequency <- 42ecac (4MHz)
Memory <- <0x0 0x0 0xffff000> (255MB)
ethernet0: local-mac-address <- 00:10:ec:00:e2:3e
ethernet1: local-mac-address <- 00:10:ec:80:e2:3e

zImage starting: loaded at 0x00400000 (sp: 0x0fe3c820)
Allocating 0x3d54dc bytes for kernel ...
gunzipping (0x00000000 <- 0x0040e000:0x007b24a4)...done 0x390af8 bytes

Linux/PowerPC load: console=ttyS0,115200 ip=dhcp nfsroot=192.168.1.2:/home/paelzer/ubuntu_ppc.8.04 root=/dev/nfs rw bootmem_debug
Finalizing device tree... flat tree at 0x40bed8
Using PowerPC 44x Platform machine description
Linux version 2.6.27-dirty (paelzer@HelionPrime) (gcc version 4.2.3) #12 Wed Oct 22 19:40:49 CEST 2008
console [udbg0] enabled
bootmem::init_bootmem_core nid=0 start=0 map=ffd end=fff mapsize=200
bootmem::mark_bootmem_node nid=0 start=0 end=fff reserve=0 flags=0
bootmem::__free nid=0 start=0 end=fff
bootmem::mark_bootmem_node nid=0 start=0 end=3e reserve=1 flags=0
bootmem::__reserve nid=0 start=0 end=3e flags=0
bootmem::mark_bootmem_node nid=0 start=40 end=41 reserve=1 flags=0
bootmem::__reserve nid=0 start=40 end=41 flags=0
bootmem::mark_bootmem_node nid=0 start=ffd end=fff reserve=1 flags=0
bootmem::__reserve nid=0 start=ffd end=fff flags=0
------------[ cut here ]------------
kernel BUG at mm/bootmem.c:320!
Oops: Exception in kernel mode, sig: 5 [#1]
PowerPC 44x Platform
NIP: c02ce838 LR: c02ca4e4 CTR: c000dcf8
REGS: c0361eb0 TRAP: 0700   Not tainted  (2.6.27-dirty)
MSR: 00021000 <ME>  CR: 22004022  XER: 0000005f
TASK = c03304a8[0] 'swapper' THREAD: c0360000
GPR00: c02e0c98 c0361f60 c03304a8 00000fff 00001000 00000001 00000000 00004000
GPR08: e0000000 00000000 ffffffff c02e0c90 22000024 00000000 0ffa6800 0ffbf000
GPR16: 100c0000 00000000 100c0000 00000000 0ffa7500 0fe3cb20 00000001 c02e0c98
GPR24: 00000000 00000001 00001000 00000fff c03a0000 00000fff c03ad1e0 c02e0c84
NIP [c02ce838] mark_bootmem+0xe0/0x124
LR [c02ca4e4] do_init_bootmem+0x134/0x168
Call Trace:
[c0361f60] [c02ce810] mark_bootmem+0xb8/0x124 (unreliable)
[c0361f90] [c02ca4e4] do_init_bootmem+0x134/0x168
[c0361fb0] [c02c8e00] setup_arch+0x13c/0x1b8
[c0361fc0] [c02c066c] start_kernel+0x94/0x2ac
[c0361ff0] [c00001e8] skpinv+0x190/0x1cc
Instruction dump:
7f07c378 4bfffe15 7c7e1b78 4192000c 2f830000 409e0024 7f9ae000 419e0050
817f0014 83bf0004 3bebffec 4bffff68 <0fe00000> 48000000 7f63db78 7fa4eb78
---[ end trace 31fd0ba7d8756001 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
Rebooting in 180 seconds..

Christian Ehrhardt wrote:
> Hi Ilya,
> I just tried your patch on my 440 board because it would help us in
> our environment.
> Unfortunately, I ran into a bug on early boot (mark_bootmem).
>
> A log can be found in this mail; this is the bug when running with 64k
> page size.
> I tried this with and without your 2/2 256k patch and also with the page
> size configured to 16k; the error is the same in all cases.
>
> I used an earlier version of your patch in the past and it worked
> fine. Applying this old patch now causes the same problem.
> Therefore I expect that some other code has changed that breaks
> with page size != 4k.
>
> I have not checked this in detail yet, but I would be grateful for any
> hints on how to fix this.
>
> => bootm
> ## Booting kernel from Legacy Image at 04000000 ...
>   Image Name:   Linux-2.6.27-dirty
>   Image Type:   PowerPC Linux Kernel Image (gzip compressed)
>   Data Size:    1512203 Bytes =  1.4 MB
>   Load Address: 00400000
>   Entry Point:  00400458
>   Verifying Checksum ... OK
>   Uncompressing Kernel Image ... OK
> CPU clock-frequency <- 0x27bc86a4 (667MHz)
> CPU timebase-frequency <- 0x27bc86a4 (667MHz)
> /plb: clock-frequency <- 9ef21a9 (167MHz)
> /plb/opb: clock-frequency <- 4f790d4 (83MHz)
> /plb/opb/ebc: clock-frequency <- 34fb5e3 (56MHz)
> /plb/opb/serial@ef600300: clock-frequency <- a8c000 (11MHz)
> /plb/opb/serial@ef600400: clock-frequency <- a8c000 (11MHz)
> /plb/opb/serial@ef600500: clock-frequency <- 42ecac (4MHz)
> /plb/opb/serial@ef600600: clock-frequency <- 42ecac (4MHz)
> Memory <- <0x0 0x0 0xffff000> (255MB)
> ethernet0: local-mac-address <- 00:10:ec:00:e2:3e
> ethernet1: local-mac-address <- 00:10:ec:80:e2:3e
>
> zImage starting: loaded at 0x00400000 (sp: 0x0fe3c820)
> Allocating 0x3c54dc bytes for kernel ...
> gunzipping (0x00000000 <- 0x0040e000:0x007a2428)...done 0x380a90 bytes
>
> Linux/PowerPC load: console=ttyS0,115200 ip=dhcp nfsroot=192.168.1.2:/home/paelzer/ubuntu_ppc.8.04 root=/dev/nfs rw
> Finalizing device tree... flat tree at 0x40bed8
> Using PowerPC 44x Platform machine description
> Linux version 2.6.27-dirty (paelzer@HelionPrime) (gcc version 4.2.3) #5 Wed Oct 22 15:15:40 CEST 2008
> console [udbg0] enabled
> ------------[ cut here ]------------
> Kernel BUG at c02be6cc [verbose debug info unavailable]
> Oops: Exception in kernel mode, sig: 5 [#1]
> PowerPC 44x Platform
> NIP: c02be6cc LR: c02ba4e4 CTR: 00000000
> REGS: c0351eb0 TRAP: 0700   Not tainted  (2.6.27-dirty)
> MSR: 00021000 <ME>  CR: 22004022  XER: 0000005f
> TASK = c03204a8[0] 'swapper' THREAD: c0350000
> GPR00: c02d0a1c c0351f60 c03204a8 00000fff 00001000 00000001 00000000 00000000
> GPR08: e0000000 00000000 ffffffff c02d0a14 22000024 00000000 0ffa6800 0ffbf000
> GPR16: c02ed838 bfe8f45c 00000000 00000000 0ffa7500 0fe3cb20 00000001 c02d0a1c
> GPR24: 00000000 00000001 00001000 00000fff c0390000 00000fff c039d1d0 c02d0a08
> NIP [c02be6cc] mark_bootmem+0xe0/0x124
> LR [c02ba4e4] do_init_bootmem+0x134/0x168
> Call Trace:
> [c0351f60] [c02be6a4] mark_bootmem+0xb8/0x124 (unreliable)
> [c0351f90] [c02ba4e4] do_init_bootmem+0x134/0x168
> [c0351fb0] [c02b8e00] setup_arch+0x13c/0x1b8
> [c0351fc0] [c02b066c] start_kernel+0x94/0x2ac
> [c0351ff0] [c00001e8] skpinv+0x190/0x1cc
> Instruction dump:
> 7f07c378 4bfffe15 7c7e1b78 4192000c 2f830000 409e0024 7f9ae000 419e0050
> 817f0014 83bf0004 3bebffec 4bffff68 <0fe00000> 48000000 7f63db78 7fa4eb78
> ---[ end trace 31fd0ba7d8756001 ]---
> Kernel panic - not syncing: Attempted to kill the idle task!
> Rebooting in 180 seconds..
>
>
> Ilya Yanok wrote:
>> This patch adds support for page sizes bigger than 4K (16K/64K) on
>> PPC 44x.
>>
>> Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
>> Signed-off-by: Vladimir Panfilov <pvr@emcraft.com>
>> Signed-off-by: Ilya Yanok <yanok@emcraft.com>
>> ---
>>  arch/powerpc/Kconfig                   |   26 ++++++++++++++++++++------
>>  arch/powerpc/include/asm/highmem.h     |    8 +++++++-
>>  arch/powerpc/include/asm/mmu-44x.h     |   18 ++++++++++++++++++
>>  arch/powerpc/include/asm/page.h        |   13 ++++++++-----
>>  arch/powerpc/include/asm/pgtable.h     |    3 +++
>>  arch/powerpc/kernel/asm-offsets.c      |    4 ++++
>>  arch/powerpc/kernel/head_44x.S         |   22 +++++++++++++---------
>>  arch/powerpc/kernel/misc_32.S          |   12 ++++++------
>>  arch/powerpc/mm/pgtable_32.c           |    9 ++-------
>>  arch/powerpc/platforms/Kconfig.cputype |    2 +-
>>  10 files changed, 82 insertions(+), 35 deletions(-)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index 587da5e..9627cfd 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -402,16 +402,30 @@ config PPC_HAS_HASH_64K
>>      depends on PPC64
>>      default n
>>
>> -config PPC_64K_PAGES
>> -    bool "64k page size"
>> -    depends on PPC64
>> -    select PPC_HAS_HASH_64K
>> +choice
>> +    prompt "Page size"
>> +    default PPC_4K_PAGES
>>      help
>> -      This option changes the kernel logical page size to 64k. On machines
>> +      The PAGE_SIZE definition. Increasing the page size may
>> +      improve the system performance in some dedicated cases like software
>> +      RAID with accelerated calculations. In PPC64 case on machines
>>        without processor support for 64k pages, the kernel will simulate
>>        them by loading each individual 4k page on demand transparently,
>>        while on hardware with such support, it will be used to map
>>        normal application pages.
>> +      If unsure, set it to 4 KB.
>> +
>> +config PPC_4K_PAGES
>> +    bool "4k page size"
>> +
>> +config PPC_16K_PAGES
>> +    bool "16k page size" if 44x
>> +
>> +config PPC_64K_PAGES
>> +    bool "64k page size" if 44x || PPC64
>> +    select PPC_HAS_HASH_64K if PPC64
>> +
>> +endchoice
>>
>>  config FORCE_MAX_ZONEORDER
>>      int "Maximum zone order"
>> @@ -435,7 +449,7 @@ config FORCE_MAX_ZONEORDER
>>
>>  config PPC_SUBPAGE_PROT
>>      bool "Support setting protections for 4k subpages"
>> -    depends on PPC_64K_PAGES
>> +    depends on PPC64 && PPC_64K_PAGES
>>      help
>>        This option adds support for a system call to allow user programs
>>        to set access permissions (read/write, readonly, or no access)
>> diff --git a/arch/powerpc/include/asm/highmem.h b/arch/powerpc/include/asm/highmem.h
>> index 5d99b64..dc1132c 100644
>> --- a/arch/powerpc/include/asm/highmem.h
>> +++ b/arch/powerpc/include/asm/highmem.h
>> @@ -38,9 +38,15 @@ extern pte_t *pkmap_page_table;
>>   * easily, subsequent pte tables have to be allocated in one physical
>>   * chunk of RAM.
>>   */
>> +#if defined(CONFIG_PPC_64K_PAGES) && !defined(CONFIG_PPC64)
>> +#define PKMAP_ORDER    (27 - PAGE_SHIFT)
>> +#define LAST_PKMAP    (1 << PKMAP_ORDER)
>> +#define PKMAP_BASE    (FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1))
>> +#else
>>  #define LAST_PKMAP     (1 << PTE_SHIFT)
>> -#define LAST_PKMAP_MASK (LAST_PKMAP-1)
>>  #define PKMAP_BASE    ((FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1)) & PMD_MASK)
>> +#endif
>> +#define LAST_PKMAP_MASK    (LAST_PKMAP-1)
>>  #define PKMAP_NR(virt)  ((virt-PKMAP_BASE) >> PAGE_SHIFT)
>>  #define PKMAP_ADDR(nr)  (PKMAP_BASE + ((nr) << PAGE_SHIFT))
>>
>> diff --git a/arch/powerpc/include/asm/mmu-44x.h b/arch/powerpc/include/asm/mmu-44x.h
>> index a825524..2ca18e8 100644
>> --- a/arch/powerpc/include/asm/mmu-44x.h
>> +++ b/arch/powerpc/include/asm/mmu-44x.h
>> @@ -4,6 +4,8 @@
>>   * PPC440 support
>>   */
>>
>> +#include <asm/page.h>
>> +
>>  #define PPC44x_MMUCR_TID    0x000000ff
>>  #define PPC44x_MMUCR_STS    0x00010000
>>
>> @@ -73,4 +75,20 @@ typedef struct {
>>  /* Size of the TLBs used for pinning in lowmem */
>>  #define PPC_PIN_SIZE    (1 << 28)    /* 256M */
>>
>> +#if (PAGE_SHIFT == 12)
>> +#define PPC44x_TLBE_SIZE    PPC44x_TLB_4K
>> +#elif (PAGE_SHIFT == 14)
>> +#define PPC44x_TLBE_SIZE    PPC44x_TLB_16K
>> +#elif (PAGE_SHIFT == 16)
>> +#define PPC44x_TLBE_SIZE    PPC44x_TLB_64K
>> +#else
>> +#error "Unsupported PAGE_SIZE"
>> +#endif
>> +
>> +#define PPC44x_PGD_OFF_SHIFT    (32 - PMD_SHIFT + 2)
>> +#define PPC44x_PGD_OFF_MASK    (PMD_SHIFT - 2)
>> +#define PPC44x_PTE_ADD_SHIFT    (32 - PMD_SHIFT + PTE_SHIFT + 3)
>> +#define PPC44x_PTE_ADD_MASK    (32 - 3 - PTE_SHIFT)
>> +#define PPC44x_RPN_MASK        (31 - PAGE_SHIFT)
>> +
>>  #endif /* _ASM_POWERPC_MMU_44X_H_ */
>> diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
>> index e088545..537d5b1 100644
>> --- a/arch/powerpc/include/asm/page.h
>> +++ b/arch/powerpc/include/asm/page.h
>> @@ -15,12 +15,15 @@
>>  #include <asm/types.h>
>>
>>  /*
>> - * On PPC32 page size is 4K. For PPC64 we support either 4K or 64K software
>> + * On regular PPC32 page size is 4K (but we support 4K/16K/64K pages
>> + * on PPC44x). For PPC64 we support either 4K or 64K software
>>   * page size. When using 64K pages however, whether we are really supporting
>>   * 64K pages in HW or not is irrelevant to those definitions.
>>   */
>> -#ifdef CONFIG_PPC_64K_PAGES
>> +#if defined(CONFIG_PPC_64K_PAGES)
>>  #define PAGE_SHIFT        16
>> +#elif defined(CONFIG_PPC_16K_PAGES)
>> +#define PAGE_SHIFT        14
>>  #else
>>  #define PAGE_SHIFT        12
>>  #endif
>> @@ -140,7 +143,7 @@ typedef struct { pte_basic_t pte; } pte_t;
>>  /* 64k pages additionally define a bigger "real PTE" type that gathers
>>   * the "second half" part of the PTE for pseudo 64k pages
>>   */
>> -#ifdef CONFIG_PPC_64K_PAGES
>> +#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC64)
>>  typedef struct { pte_t pte; unsigned long hidx; } real_pte_t;
>>  #else
>>  typedef struct { pte_t pte; } real_pte_t;
>> @@ -180,10 +183,10 @@ typedef pte_basic_t pte_t;
>>  #define pte_val(x)    (x)
>>  #define __pte(x)    (x)
>>
>> -#ifdef CONFIG_PPC_64K_PAGES
>> +#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC64)
>>  typedef struct { pte_t pte; unsigned long hidx; } real_pte_t;
>>  #else
>> -typedef unsigned long real_pte_t;
>> +typedef pte_t real_pte_t;
>>  #endif
>>
>>
>> diff --git a/arch/powerpc/include/asm/pgtable.h 
>> b/arch/powerpc/include/asm/pgtable.h
>> index dbb8ca1..0d447fb 100644
>> --- a/arch/powerpc/include/asm/pgtable.h
>> +++ b/arch/powerpc/include/asm/pgtable.h
>> @@ -39,6 +39,9 @@ extern void paging_init(void);
>>
>>  #include <asm-generic/pgtable.h>
>>
>> +#define PGD_T_LOG2    (__builtin_ffs(sizeof(pgd_t)) - 1)
>> +#define PMD_T_LOG2    (__builtin_ffs(sizeof(pmd_t)) - 1)
>> +#define PTE_T_LOG2    (__builtin_ffs(sizeof(pte_t)) - 1)
>>
>>  /*
>>   * This gets called at the end of handling a page fault, when
>> diff --git a/arch/powerpc/kernel/asm-offsets.c 
>> b/arch/powerpc/kernel/asm-offsets.c
>> index 92768d3..98b8bb6 100644
>> --- a/arch/powerpc/kernel/asm-offsets.c
>> +++ b/arch/powerpc/kernel/asm-offsets.c
>> @@ -375,6 +375,10 @@ int main(void)
>>      DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, 
>> arch.fault_dear));
>>      DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
>>  #endif
>> +#ifdef CONFIG_44x
>> +    DEFINE(PMD_SHIFT, PMD_SHIFT);
>> +    DEFINE(PTE_SHIFT, PTE_SHIFT);
>> +#endif
>>
>>      return 0;
>>  }
>> diff --git a/arch/powerpc/kernel/head_44x.S 
>> b/arch/powerpc/kernel/head_44x.S
>> index f3a1ea9..6525124 100644
>> --- a/arch/powerpc/kernel/head_44x.S
>> +++ b/arch/powerpc/kernel/head_44x.S
>> @@ -391,12 +391,14 @@ interrupt_base:
>>      rlwimi    r13,r12,10,30,30
>>
>>      /* Load the PTE */
>> -    rlwinm     r12, r10, 13, 19, 29    /* Compute pgdir/pmd offset */
>> +    /* Compute pgdir/pmd offset */
>> +    rlwinm  r12, r10, PPC44x_PGD_OFF_SHIFT, PPC44x_PGD_OFF_MASK, 29
>>      lwzx    r11, r12, r11        /* Get pgd/pmd entry */
>>      rlwinm.    r12, r11, 0, 0, 20    /* Extract pt base address */
>>      beq    2f            /* Bail if no table */
>>
>> -    rlwimi    r12, r10, 23, 20, 28    /* Compute pte address */
>> +    /* Compute pte address */
>> +    rlwimi  r12, r10, PPC44x_PTE_ADD_SHIFT, PPC44x_PTE_ADD_MASK, 28
>>      lwz    r11, 0(r12)        /* Get high word of pte entry */
>>      lwz    r12, 4(r12)        /* Get low word of pte entry */
>>
>> @@ -485,12 +487,14 @@ tlb_44x_patch_hwater_D:
>>      /* Make up the required permissions */
>>      li    r13,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_HWEXEC
>>
>> -    rlwinm    r12, r10, 13, 19, 29    /* Compute pgdir/pmd offset */
>> +    /* Compute pgdir/pmd offset */
>> +    rlwinm     r12, r10, PPC44x_PGD_OFF_SHIFT, PPC44x_PGD_OFF_MASK, 29
>>      lwzx    r11, r12, r11        /* Get pgd/pmd entry */
>>      rlwinm.    r12, r11, 0, 0, 20    /* Extract pt base address */
>>      beq    2f            /* Bail if no table */
>>
>> -    rlwimi    r12, r10, 23, 20, 28    /* Compute pte address */
>> +    /* Compute pte address */
>> +    rlwimi    r12, r10, PPC44x_PTE_ADD_SHIFT, PPC44x_PTE_ADD_MASK, 28
>>      lwz    r11, 0(r12)        /* Get high word of pte entry */
>>      lwz    r12, 4(r12)        /* Get low word of pte entry */
>>
>> @@ -554,15 +558,15 @@ tlb_44x_patch_hwater_I:
>>   */
>>  finish_tlb_load:
>>      /* Combine RPN & ERPN an write WS 0 */
>> -    rlwimi    r11,r12,0,0,19
>> +    rlwimi    r11,r12,0,0,PPC44x_RPN_MASK
>>      tlbwe    r11,r13,PPC44x_TLB_XLAT
>>
>>      /*
>>       * Create WS1. This is the faulting address (EPN),
>>       * page size, and valid flag.
>>       */
>> -    li    r11,PPC44x_TLB_VALID | PPC44x_TLB_4K
>> -    rlwimi    r10,r11,0,20,31            /* Insert valid and page 
>> size*/
>> +    li    r11,PPC44x_TLB_VALID | PPC44x_TLBE_SIZE
>> +    rlwimi    r10,r11,0,PPC44x_PTE_ADD_MASK,31/* Insert valid and 
>> page size*/
>>      tlbwe    r10,r13,PPC44x_TLB_PAGEID    /* Write PAGEID */
>>
>>      /* And WS 2 */
>> @@ -634,12 +638,12 @@ _GLOBAL(set_context)
>>   * goes at the beginning of the data segment, which is page-aligned.
>>   */
>>      .data
>> -    .align    12
>> +    .align    PAGE_SHIFT
>>      .globl    sdata
>>  sdata:
>>      .globl    empty_zero_page
>>  empty_zero_page:
>> -    .space    4096
>> +    .space    PAGE_SIZE
>>
>>  /*
>>   * To support >32-bit physical addresses, we use an 8KB pgdir.
>> diff --git a/arch/powerpc/kernel/misc_32.S 
>> b/arch/powerpc/kernel/misc_32.S
>> index 7a6dfbc..0110fcd 100644
>> --- a/arch/powerpc/kernel/misc_32.S
>> +++ b/arch/powerpc/kernel/misc_32.S
>> @@ -589,8 +589,8 @@ _GLOBAL(__flush_dcache_icache)
>>  BEGIN_FTR_SECTION
>>      blr
>>  END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
>> -    rlwinm    r3,r3,0,0,19            /* Get page base address */
>> -    li    r4,4096/L1_CACHE_BYTES    /* Number of lines in a page */
>> +    rlwinm    r3,r3,0,0,PPC44x_RPN_MASK    /* Get page base address */
>> +    li    r4,PAGE_SIZE/L1_CACHE_BYTES    /* Number of lines in a 
>> page */
>>      mtctr    r4
>>      mr    r6,r3
>>  0:    dcbst    0,r3                /* Write line to ram */
>> @@ -630,8 +630,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
>>      rlwinm    r0,r10,0,28,26            /* clear DR */
>>      mtmsr    r0
>>      isync
>> -    rlwinm    r3,r3,0,0,19            /* Get page base address */
>> -    li    r4,4096/L1_CACHE_BYTES    /* Number of lines in a page */
>> +    rlwinm    r3,r3,0,0,PPC44x_RPN_MASK    /* Get page base address */
>> +    li    r4,PAGE_SIZE/L1_CACHE_BYTES    /* Number of lines in a 
>> page */
>>      mtctr    r4
>>      mr    r6,r3
>>  0:    dcbst    0,r3                /* Write line to ram */
>> @@ -655,7 +655,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
>>   * void clear_pages(void *page, int order) ;
>>   */
>>  _GLOBAL(clear_pages)
>> -    li    r0,4096/L1_CACHE_BYTES
>> +    li    r0,PAGE_SIZE/L1_CACHE_BYTES
>>      slw    r0,r0,r4
>>      mtctr    r0
>>  #ifdef CONFIG_8xx
>> @@ -713,7 +713,7 @@ _GLOBAL(copy_page)
>>      dcbt    r5,r4
>>      li    r11,L1_CACHE_BYTES+4
>>  #endif /* MAX_COPY_PREFETCH */
>> -    li    r0,4096/L1_CACHE_BYTES - MAX_COPY_PREFETCH
>> +    li    r0,PAGE_SIZE/L1_CACHE_BYTES - MAX_COPY_PREFETCH
>>      crclr    4*cr0+eq
>>  2:
>>      mtctr    r0
>> diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
>> index 2001abd..4eed001 100644
>> --- a/arch/powerpc/mm/pgtable_32.c
>> +++ b/arch/powerpc/mm/pgtable_32.c
>> @@ -72,12 +72,7 @@ extern unsigned long p_mapped_by_tlbcam(unsigned 
>> long pa);
>>  #define p_mapped_by_tlbcam(x)    (0UL)
>>  #endif /* HAVE_TLBCAM */
>>
>> -#ifdef CONFIG_PTE_64BIT
>> -/* 44x uses an 8kB pgdir because it has 8-byte Linux PTEs. */
>> -#define PGDIR_ORDER    1
>> -#else
>> -#define PGDIR_ORDER    0
>> -#endif
>> +#define PGDIR_ORDER    max(32 + PGD_T_LOG2 - PGDIR_SHIFT - 
>> PAGE_SHIFT, 0)
>>
>>  pgd_t *pgd_alloc(struct mm_struct *mm)
>>  {
>> @@ -400,7 +395,7 @@ void kernel_map_pages(struct page *page, int 
>> numpages, int enable)
>>  #endif /* CONFIG_DEBUG_PAGEALLOC */
>>
>>  static int fixmaps;
>> -unsigned long FIXADDR_TOP = 0xfffff000;
>> +unsigned long FIXADDR_TOP = (-PAGE_SIZE);
>>  EXPORT_SYMBOL(FIXADDR_TOP);
>>
>>  void __set_fixmap (enum fixed_addresses idx, phys_addr_t phys, 
>> pgprot_t flags)
>> diff --git a/arch/powerpc/platforms/Kconfig.cputype 
>> b/arch/powerpc/platforms/Kconfig.cputype
>> index 7f65127..a1386a4 100644
>> --- a/arch/powerpc/platforms/Kconfig.cputype
>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>> @@ -202,7 +202,7 @@ config PPC_STD_MMU_32
>>
>>  config PPC_MM_SLICES
>>      bool
>> -    default y if HUGETLB_PAGE || PPC_64K_PAGES
>> +    default y if HUGETLB_PAGE || (PPC64 && PPC_64K_PAGES)
>>      default n
>>
>>  config VIRT_CPU_ACCOUNTING
>>   
>
>
Milton Miller Nov. 10, 2008, 3:09 p.m. UTC | #6
On 2008-10-16 at 02:22:31, Ilya Yanok wrote:

I started out looking at the too-minimal description of patch 2/2, and 
that morphed into talking about both patches.

> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 587da5e..9627cfd 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -402,16 +402,30 @@  config PPC_HAS_HASH_64K
>         depends on PPC64
>         default n
>
> -config PPC_64K_PAGES
> -       bool "64k page size"
> -       depends on PPC64
> -       select PPC_HAS_HASH_64K
> +choice
> +       prompt "Page size"
> +       default PPC_4K_PAGES
>         help
> -         This option changes the kernel logical page size to 64k. On 
> machines
> +         The PAGE_SIZE definition. Increasing the page size may
> +         improve the system performance in some dedicated cases like 
> software
> +         RAID with accelerated calculations. In PPC64 case on machines
>           without processor support for 64k pages, the kernel will 
> simulate
>           them by loading each individual 4k page on demand 
> transparently,
>           while on hardware with such support, it will be used to map
>           normal application pages.
> +         If unsure, set it to 4 KB.
> +

This is less understandable (more hacker jargon) and too application 
specific.  (Josh, since this is cross-sub-platform we need to make sure 
this fragment gets proper review).

Also, we need to check the help placement, as I seem to remember the 
config programs looking at the first choice instead of the choice tag.  
Or should the help be split by option?

Let's try this:

Select the kernel logical page size.  Increasing the page size will 
reduce software overhead at each page boundary, allow hardware prefetch 
mechanisms to be more effective, and allow larger DMA transfers, 
increasing IO efficiency and reducing overhead.  However, the 
utilization of memory will increase.  For example, each cached file 
will use a multiple of the page size to hold its contents, and the 
difference between the end of file and the end of page is wasted.

Some dedicated systems, such as software RAID serving with accelerated 
calculations, have shown significant performance increases.

If you configure a 64 bit kernel for 64k pages but the processor does 
not support them, then the kernel will simulate them with 4k pages, 
loading them on demand, but with the reduced software overhead and 
larger internal fragmentation.  For the 32 bit kernel, a large page 
option will not be offered unless it is supported by the configured 
processor.

If unsure, choose 4K_PAGES.
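
To put rough numbers on the internal fragmentation mentioned in the proposed help text, here is a small sketch in C. The helper name `tail_waste` and the 5000-byte sample file are illustrative assumptions only; nothing like this exists in the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes left unused between the end of a file and the end of its last
 * page when the file is cached in pages of (1 << page_shift) bytes.
 * Hypothetical helper, for illustration only.
 */
static size_t tail_waste(size_t file_size, unsigned page_shift)
{
	size_t page_size = (size_t)1 << page_shift;
	size_t rem = file_size & (page_size - 1);

	return rem ? page_size - rem : 0;
}

/* The same 5000-byte file under the three page sizes the patch offers. */
static void tail_waste_example(void)
{
	assert(tail_waste(5000, 12) == 3192);	/* 4K pages  */
	assert(tail_waste(5000, 14) == 11384);	/* 16K pages */
	assert(tail_waste(5000, 16) == 60536);	/* 64K pages */
}
```

For small files the waste per file grows almost linearly with the page size, which is the trade-off the help text asks the user to weigh.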


> +config PPC_4K_PAGES
> +       bool "4k page size"
> +
> +config PPC_16K_PAGES
> +       bool "16k page size" if 44x
> +
> +config PPC_64K_PAGES
> +       bool "64k page size" if 44x || PPC64
> +       select PPC_HAS_HASH_64K if PPC64
> +
> +endchoice
>


> diff --git a/arch/powerpc/include/asm/highmem.h 
> b/arch/powerpc/include/asm/highmem.h
> index 5d99b64..dc1132c 100644
> --- a/arch/powerpc/include/asm/highmem.h
> +++ b/arch/powerpc/include/asm/highmem.h
> @@ -38,9 +38,15 @@  extern pte_t *pkmap_page_table;
>   * easily, subsequent pte tables have to be allocated in one physical
>   * chunk of RAM.
>   */
> +#if defined(CONFIG_PPC_64K_PAGES) && !defined(CONFIG_PPC64)

In patch 2/2 I was going to comment about the precedence of PPC64 vs 
64K_PAGES, but then I realized this file is only included when 
CONFIG_HIGHMEM is set, and that depends on PPC32, so CONFIG_PPC64 will 
never be set here.  Please remove the additional noise 
&& !defined(CONFIG_PPC64).

> +#define PKMAP_ORDER    (27 - PAGE_SHIFT)
Where did the value 27 come from?

> +#define LAST_PKMAP     (1 << PKMAP_ORDER)
> +#define PKMAP_BASE     (FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1))
> +#else
>  #define LAST_PKMAP     (1 << PTE_SHIFT)
> -#define LAST_PKMAP_MASK (LAST_PKMAP-1)
>  #define PKMAP_BASE     ((FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1)) 
> & PMD_MASK)
> +#endif
> +#define LAST_PKMAP_MASK        (LAST_PKMAP-1)

And why not set PKMAP_ORDER on both sides of the #else, keeping 
LAST_PKMAP common?

>  #define PKMAP_NR(virt)  ((virt-PKMAP_BASE) >> PAGE_SHIFT)
>  #define PKMAP_ADDR(nr)  (PKMAP_BASE + ((nr) << PAGE_SHIFT))
>
>


> diff --git a/arch/powerpc/include/asm/pgtable.h 
> b/arch/powerpc/include/asm/pgtable.h
> index dbb8ca1..0d447fb 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable.h
> @@ -39,6 +39,9 @@  extern void paging_init(void);
>
>  #include <asm-generic/pgtable.h>
>
> +#define PGD_T_LOG2     (__builtin_ffs(sizeof(pgd_t)) - 1)
> +#define PMD_T_LOG2     (__builtin_ffs(sizeof(pmd_t)) - 1)
> +#define PTE_T_LOG2     (__builtin_ffs(sizeof(pte_t)) - 1)
>

> diff --git a/arch/powerpc/include/asm/mmu-44x.h 
> b/arch/powerpc/include/asm/mmu-44x.h
> index a825524..2ca18e8 100644
> --- a/arch/powerpc/include/asm/mmu-44x.h
> +++ b/arch/powerpc/include/asm/mmu-44x.h

> +#define PPC44x_PGD_OFF_SHIFT   (32 - PMD_SHIFT + 2)
> +#define PPC44x_PGD_OFF_MASK    (PMD_SHIFT - 2)
> +#define PPC44x_PTE_ADD_SHIFT   (32 - PMD_SHIFT + PTE_SHIFT + 3)
> +#define PPC44x_PTE_ADD_MASK    (32 - 3 - PTE_SHIFT)
> +#define PPC44x_RPN_MASK                (31 - PAGE_SHIFT)
> +

Are the values 2 and 3 related to the new defines PG*_T_LOG2 ?
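
One way to see where the 2 and 3 come from is to model the rotate-and-mask in C: the result is a byte offset, i.e. an index shifted left by log2 of the entry size. The sketch below is only a model. The helper names are hypothetical, it handles non-wrapping masks only, and it assumes 4-byte pgd entries and 8-byte PTEs with PMD_SHIFT = 21 and PTE_SHIFT = 9 for 4K pages. The real handler uses rlwimi for the PTE address, which additionally merges the field into the table base.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
	n &= 31;
	return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Mask with IBM-numbered bits mb..me set (bit 0 is the MSB); needs mb <= me. */
static uint32_t ppc_mask(unsigned mb, unsigned me)
{
	return (0xffffffffu >> mb) & (0xffffffffu << (31 - me));
}

/* C model of "rlwinm rA,rS,SH,MB,ME" for non-wrapping masks. */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
	return rotl32(rs, sh) & ppc_mask(mb, me);
}

/* Byte offset of the pgd entry, using the patch's shift/mask formulas. */
static uint32_t pgd_offset_bytes(uint32_t ea, unsigned pmd_shift)
{
	return rlwinm(ea, 32 - pmd_shift + 2, pmd_shift - 2, 29);
}

/* Byte offset of the PTE within its table; same idea with the value 3. */
static uint32_t pte_offset_bytes(uint32_t ea, unsigned pmd_shift,
				 unsigned pte_shift)
{
	return rlwinm(ea, 32 - pmd_shift + pte_shift + 3,
		      32 - 3 - pte_shift, 28);
}

static void check_offsets(void)
{
	uint32_t ea = 0xc0345678u;

	/* index << 2: pgd entries are assumed to be 4 bytes */
	assert(pgd_offset_bytes(ea, 21) == ((ea >> 21) << 2));
	/* index << 3: PTEs are assumed to be 8 bytes */
	assert(pte_offset_bytes(ea, 21, 9) == (((ea >> 12) & 0x1ff) << 3));
}
```

With PMD_SHIFT = 21 the first formula gives shift 13 and mask start 19, which are the constants the patch removes from head_44x.S.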

milton
Ilya Yanok Nov. 10, 2008, 4:50 p.m. UTC | #7
Hello Milton,

Milton Miller wrote:
> I started out looking at the too-minimal description of patch 2/2, and
> that morphed into talking about both patches.
>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index 587da5e..9627cfd 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -402,16 +402,30 @@  config PPC_HAS_HASH_64K
>>         depends on PPC64
>>         default n
>>
>> -config PPC_64K_PAGES
>> -       bool "64k page size"
>> -       depends on PPC64
>> -       select PPC_HAS_HASH_64K
>> +choice
>> +       prompt "Page size"
>> +       default PPC_4K_PAGES
>>         help
>> -         This option changes the kernel logical page size to 64k. On
>> machines
>> +         The PAGE_SIZE definition. Increasing the page size may
>> +         improve the system performance in some dedicated cases like
>> software
>> +         RAID with accelerated calculations. In PPC64 case on machines
>>           without processor support for 64k pages, the kernel will
>> simulate
>>           them by loading each individual 4k page on demand
>> transparently,
>>           while on hardware with such support, it will be used to map
>>           normal application pages.
>> +         If unsure, set it to 4 KB.
>> +
>
> This is less understandable (more hacker jargon) and too application
> specific.  (Josh, since this is cross-sub-platform we need to make
> sure this fragment gets proper review).
>
> Also, we need to check the help placement, as I seem to remember the
> config programs looking at the first choice instead of the choice
> tag.  Or should the help be split by option?

Help at the choice tag works properly.

> Let's try this:
>
> Select the kernel logical page size.  Increasing the page size will
> reduce software overhead at each page boundary, allow hardware
> prefetch mechanisms to be more effective, and allow larger DMA
> transfers, increasing IO efficiency and reducing overhead.  However,
> the utilization of memory will increase.  For example, each cached file
> will use a multiple of the page size to hold its contents, and the
> difference between the end of file and the end of page is wasted.
>
> Some dedicated systems, such as software RAID serving with accelerated
> calculations, have shown significant performance increases.
>
> If you configure a 64 bit kernel for 64k pages but the processor does
> not support them, then the kernel will simulate them with 4k pages,
> loading them on demand, but with the reduced software overhead and
> larger internal fragmentation.  For the 32 bit kernel, a large page
> option will not be offered unless it is supported by the configured
> processor.
>
> If unsure, choose 4K_PAGES.

This looks much better to me. I'll include this help message in the
updated patch.

>> +config PPC_4K_PAGES
>> +       bool "4k page size"
>> +
>> +config PPC_16K_PAGES
>> +       bool "16k page size" if 44x
>> +
>> +config PPC_64K_PAGES
>> +       bool "64k page size" if 44x || PPC64
>> +       select PPC_HAS_HASH_64K if PPC64
>> +
>> +endchoice
>>
>
>
>> diff --git a/arch/powerpc/include/asm/highmem.h
>> b/arch/powerpc/include/asm/highmem.h
>> index 5d99b64..dc1132c 100644
>> --- a/arch/powerpc/include/asm/highmem.h
>> +++ b/arch/powerpc/include/asm/highmem.h
>> @@ -38,9 +38,15 @@  extern pte_t *pkmap_page_table;
>>   * easily, subsequent pte tables have to be allocated in one physical
>>   * chunk of RAM.
>>   */
>> +#if defined(CONFIG_PPC_64K_PAGES) && !defined(CONFIG_PPC64)
>
> In patch 2/2 I was going to comment about the precedence of PPC64 vs
> 64K_PAGES, but then I realized this file is only included when
> CONFIG_HIGHMEM is set and that depends on PPC32 , so it will never be
> set.   Please remove the additional noise && !defined(CONFIG_PPC64).

Ok.

>> +#define PKMAP_ORDER    (27 - PAGE_SHIFT)
> where did the value 27 come from?

Hm... It was found pretty much experimentally. There is a range of values
that gives us a proper virtual memory map (VMALLOC_START < VMALLOC_END),
and I have no clear idea which one we should use.
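
For reference, the arithmetic behind the constant can be checked directly: with PKMAP_ORDER = 27 - PAGE_SHIFT the kmap window always spans 1 << 27 bytes (128 MiB), whatever the page size. A minimal sketch, with hypothetical helper names that are not part of the patch:

```c
#include <assert.h>

/* PKMAP_ORDER as defined in the 64K branch of the patch. */
static unsigned pkmap_order(unsigned page_shift)
{
	return 27 - page_shift;
}

/* LAST_PKMAP: number of kmap slots. */
static unsigned long last_pkmap(unsigned page_shift)
{
	return 1UL << pkmap_order(page_shift);
}

/* Virtual address space covered by all kmap slots, in bytes. */
static unsigned long pkmap_window_bytes(unsigned page_shift)
{
	return last_pkmap(page_shift) << page_shift;
}

static void pkmap_window_example(void)
{
	assert(pkmap_order(16) == 11);			/* 64K pages */
	assert(last_pkmap(16) == 2048);
	assert(pkmap_window_bytes(16) == (1UL << 27));	/* 128 MiB */
}
```

By comparison, keeping LAST_PKMAP = 1 << PTE_SHIFT with 64K pages would, assuming PTE_SHIFT = PAGE_SHIFT - 3 = 13, give 8192 slots of 64K each, i.e. 512 MiB of kernel virtual address space.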

>> +#define LAST_PKMAP     (1 << PKMAP_ORDER)
>> +#define PKMAP_BASE     (FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1))
>> +#else
>>  #define LAST_PKMAP     (1 << PTE_SHIFT)
>> -#define LAST_PKMAP_MASK (LAST_PKMAP-1)
>>  #define PKMAP_BASE     ((FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1))
>> & PMD_MASK)
>> +#endif
>> +#define LAST_PKMAP_MASK        (LAST_PKMAP-1)
>
> And why not set PKMAP_ORDER on both sides of the #else, keeping
> LAST_PKMAP common?

We can do this, but I don't see much benefit... We would still need to
define PKMAP_BASE differently.
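
For illustration, the suggested layout might look roughly like the sketch below: PKMAP_ORDER and PKMAP_BASE differ per branch, while LAST_PKMAP and LAST_PKMAP_MASK are defined once. The values of PAGE_SHIFT, PTE_SHIFT and FIXADDR_START here are placeholders chosen so the fragment compiles on its own; they are assumptions, not the kernel's definitions.

```c
#include <assert.h>

/* Placeholder values so the fragment is self-contained (assumptions). */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PTE_SHIFT	(PAGE_SHIFT - 3)	/* assumes 8-byte PTEs */
#define PMD_SHIFT	(PAGE_SHIFT + PTE_SHIFT)
#define PMD_MASK	(~((1UL << PMD_SHIFT) - 1))
#define FIXADDR_START	0xfffe0000UL		/* hypothetical address */

#if PAGE_SHIFT == 16
#define PKMAP_ORDER	(27 - PAGE_SHIFT)
#define PKMAP_BASE	(FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1))
#else
#define PKMAP_ORDER	PTE_SHIFT
#define PKMAP_BASE	((FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1)) & PMD_MASK)
#endif
/* Common to both branches, as suggested in the review. */
#define LAST_PKMAP	(1 << PKMAP_ORDER)
#define LAST_PKMAP_MASK	(LAST_PKMAP - 1)

static void pkmap_layout_check(void)
{
	assert(LAST_PKMAP == 512);		/* 1 << 9 with 4K pages */
	assert((PKMAP_BASE & ~PMD_MASK) == 0);	/* PMD-aligned in this branch */
}
```

PKMAP_BASE still needs two definitions, so the gain is limited to sharing the LAST_PKMAP lines.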

>>  #define PKMAP_NR(virt)  ((virt-PKMAP_BASE) >> PAGE_SHIFT)
>>  #define PKMAP_ADDR(nr)  (PKMAP_BASE + ((nr) << PAGE_SHIFT))
>>
>>
>
>
>> diff --git a/arch/powerpc/include/asm/pgtable.h
>> b/arch/powerpc/include/asm/pgtable.h
>> index dbb8ca1..0d447fb 100644
>> --- a/arch/powerpc/include/asm/pgtable.h
>> +++ b/arch/powerpc/include/asm/pgtable.h
>> @@ -39,6 +39,9 @@  extern void paging_init(void);
>>
>>  #include <asm-generic/pgtable.h>
>>
>> +#define PGD_T_LOG2     (__builtin_ffs(sizeof(pgd_t)) - 1)
>> +#define PMD_T_LOG2     (__builtin_ffs(sizeof(pmd_t)) - 1)
>> +#define PTE_T_LOG2     (__builtin_ffs(sizeof(pte_t)) - 1)
>>
>
>> diff --git a/arch/powerpc/include/asm/mmu-44x.h
>> b/arch/powerpc/include/asm/mmu-44x.h
>> index a825524..2ca18e8 100644
>> --- a/arch/powerpc/include/asm/mmu-44x.h
>> +++ b/arch/powerpc/include/asm/mmu-44x.h
>
>> +#define PPC44x_PGD_OFF_SHIFT   (32 - PMD_SHIFT + 2)
>> +#define PPC44x_PGD_OFF_MASK    (PMD_SHIFT - 2)
>> +#define PPC44x_PTE_ADD_SHIFT   (32 - PMD_SHIFT + PTE_SHIFT + 3)
>> +#define PPC44x_PTE_ADD_MASK    (32 - 3 - PTE_SHIFT)
>> +#define PPC44x_RPN_MASK                (31 - PAGE_SHIFT)
>> +
>
> Are the values 2 and 3 related to the new defines PG*_T_LOG2 ?

Looks like you are right.
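
As a sanity check of that reading, here is a sketch in which the literal 2 and 3 are replaced by the log2 sizes and the results are compared with the constants previously hard-coded in head_44x.S. The values assumed here (4K pages, 4-byte pgd entries, 8-byte PTEs, PTE_SHIFT = PAGE_SHIFT - 3) are assumptions for the 44x case and are written out as plain numbers, since sizeof is not available to the assembler.

```c
#include <assert.h>

/* Assumed 44x values for 4K pages; literals stand in for sizeof-based macros. */
#define PAGE_SHIFT	12
#define PGD_T_LOG2	2	/* log2(sizeof(pgd_t)), assumed 4 bytes */
#define PTE_T_LOG2	3	/* log2(sizeof(pte_t)), assumed 8 bytes */
#define PTE_SHIFT	(PAGE_SHIFT - PTE_T_LOG2)
#define PMD_SHIFT	(PAGE_SHIFT + PTE_SHIFT)

#define PPC44x_PGD_OFF_SHIFT	(32 - PMD_SHIFT + PGD_T_LOG2)
#define PPC44x_PGD_OFF_MASK	(PMD_SHIFT - PGD_T_LOG2)
#define PPC44x_PTE_ADD_SHIFT	(32 - PMD_SHIFT + PTE_SHIFT + PTE_T_LOG2)
#define PPC44x_PTE_ADD_MASK	(32 - PTE_T_LOG2 - PTE_SHIFT)
#define PPC44x_RPN_MASK		(31 - PAGE_SHIFT)

/* The old hard-coded operands: rlwinm ...,13,19,29 and rlwimi ...,23,20,28. */
static void check_4k_constants(void)
{
	assert(PPC44x_PGD_OFF_SHIFT == 13);
	assert(PPC44x_PGD_OFF_MASK == 19);
	assert(PPC44x_PTE_ADD_SHIFT == 23);
	assert(PPC44x_PTE_ADD_MASK == 20);
	assert(PPC44x_RPN_MASK == 19);
}
```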

Thanks for your comments.

Regards, Ilya.

Patch

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 587da5e..9627cfd 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -402,16 +402,30 @@  config PPC_HAS_HASH_64K
 	depends on PPC64
 	default n
 
-config PPC_64K_PAGES
-	bool "64k page size"
-	depends on PPC64
-	select PPC_HAS_HASH_64K
+choice
+	prompt "Page size"
+	default PPC_4K_PAGES
 	help
-	  This option changes the kernel logical page size to 64k. On machines
+	  The PAGE_SIZE definition. Increasing the page size may
+	  improve the system performance in some dedicated cases like software
+	  RAID with accelerated calculations. In PPC64 case on machines
 	  without processor support for 64k pages, the kernel will simulate
 	  them by loading each individual 4k page on demand transparently,
 	  while on hardware with such support, it will be used to map
 	  normal application pages.
+	  If unsure, set it to 4 KB.
+
+config PPC_4K_PAGES
+	bool "4k page size"
+
+config PPC_16K_PAGES
+	bool "16k page size" if 44x
+
+config PPC_64K_PAGES
+	bool "64k page size" if 44x || PPC64
+	select PPC_HAS_HASH_64K if PPC64
+
+endchoice
 
 config FORCE_MAX_ZONEORDER
 	int "Maximum zone order"
@@ -435,7 +449,7 @@  config FORCE_MAX_ZONEORDER
 
 config PPC_SUBPAGE_PROT
 	bool "Support setting protections for 4k subpages"
-	depends on PPC_64K_PAGES
+	depends on PPC64 && PPC_64K_PAGES
 	help
 	  This option adds support for a system call to allow user programs
 	  to set access permissions (read/write, readonly, or no access)
diff --git a/arch/powerpc/include/asm/highmem.h b/arch/powerpc/include/asm/highmem.h
index 5d99b64..dc1132c 100644
--- a/arch/powerpc/include/asm/highmem.h
+++ b/arch/powerpc/include/asm/highmem.h
@@ -38,9 +38,15 @@  extern pte_t *pkmap_page_table;
  * easily, subsequent pte tables have to be allocated in one physical
  * chunk of RAM.
  */
+#if defined(CONFIG_PPC_64K_PAGES) && !defined(CONFIG_PPC64)
+#define PKMAP_ORDER	(27 - PAGE_SHIFT)
+#define LAST_PKMAP	(1 << PKMAP_ORDER)
+#define PKMAP_BASE	(FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1))
+#else
 #define LAST_PKMAP 	(1 << PTE_SHIFT)
-#define LAST_PKMAP_MASK (LAST_PKMAP-1)
 #define PKMAP_BASE	((FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1)) & PMD_MASK)
+#endif
+#define LAST_PKMAP_MASK	(LAST_PKMAP-1)
 #define PKMAP_NR(virt)  ((virt-PKMAP_BASE) >> PAGE_SHIFT)
 #define PKMAP_ADDR(nr)  (PKMAP_BASE + ((nr) << PAGE_SHIFT))
 
diff --git a/arch/powerpc/include/asm/mmu-44x.h b/arch/powerpc/include/asm/mmu-44x.h
index a825524..2ca18e8 100644
--- a/arch/powerpc/include/asm/mmu-44x.h
+++ b/arch/powerpc/include/asm/mmu-44x.h
@@ -4,6 +4,8 @@ 
  * PPC440 support
  */
 
+#include <asm/page.h>
+
 #define PPC44x_MMUCR_TID	0x000000ff
 #define PPC44x_MMUCR_STS	0x00010000
 
@@ -73,4 +75,20 @@  typedef struct {
 /* Size of the TLBs used for pinning in lowmem */
 #define PPC_PIN_SIZE	(1 << 28)	/* 256M */
 
+#if (PAGE_SHIFT == 12)
+#define PPC44x_TLBE_SIZE	PPC44x_TLB_4K
+#elif (PAGE_SHIFT == 14)
+#define PPC44x_TLBE_SIZE	PPC44x_TLB_16K
+#elif (PAGE_SHIFT == 16)
+#define PPC44x_TLBE_SIZE	PPC44x_TLB_64K
+#else
+#error "Unsupported PAGE_SIZE"
+#endif
+
+#define PPC44x_PGD_OFF_SHIFT	(32 - PMD_SHIFT + 2)
+#define PPC44x_PGD_OFF_MASK	(PMD_SHIFT - 2)
+#define PPC44x_PTE_ADD_SHIFT	(32 - PMD_SHIFT + PTE_SHIFT + 3)
+#define PPC44x_PTE_ADD_MASK	(32 - 3 - PTE_SHIFT)
+#define PPC44x_RPN_MASK		(31 - PAGE_SHIFT)
+
 #endif /* _ASM_POWERPC_MMU_44X_H_ */
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index e088545..537d5b1 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -15,12 +15,15 @@ 
 #include <asm/types.h>
 
 /*
- * On PPC32 page size is 4K. For PPC64 we support either 4K or 64K software
+ * On regular PPC32 page size is 4K (but we support 4K/16K/64K pages
+ * on PPC44x). For PPC64 we support either 4K or 64K software
  * page size. When using 64K pages however, whether we are really supporting
  * 64K pages in HW or not is irrelevant to those definitions.
  */
-#ifdef CONFIG_PPC_64K_PAGES
+#if defined(CONFIG_PPC_64K_PAGES)
 #define PAGE_SHIFT		16
+#elif defined(CONFIG_PPC_16K_PAGES)
+#define PAGE_SHIFT		14
 #else
 #define PAGE_SHIFT		12
 #endif
@@ -140,7 +143,7 @@  typedef struct { pte_basic_t pte; } pte_t;
 /* 64k pages additionally define a bigger "real PTE" type that gathers
  * the "second half" part of the PTE for pseudo 64k pages
  */
-#ifdef CONFIG_PPC_64K_PAGES
+#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC64)
 typedef struct { pte_t pte; unsigned long hidx; } real_pte_t;
 #else
 typedef struct { pte_t pte; } real_pte_t;
@@ -180,10 +183,10 @@  typedef pte_basic_t pte_t;
 #define pte_val(x)	(x)
 #define __pte(x)	(x)
 
-#ifdef CONFIG_PPC_64K_PAGES
+#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC64)
 typedef struct { pte_t pte; unsigned long hidx; } real_pte_t;
 #else
-typedef unsigned long real_pte_t;
+typedef pte_t real_pte_t;
 #endif
 
 
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index dbb8ca1..0d447fb 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -39,6 +39,9 @@  extern void paging_init(void);
 
 #include <asm-generic/pgtable.h>
 
+#define PGD_T_LOG2	(__builtin_ffs(sizeof(pgd_t)) - 1)
+#define PMD_T_LOG2	(__builtin_ffs(sizeof(pmd_t)) - 1)
+#define PTE_T_LOG2	(__builtin_ffs(sizeof(pte_t)) - 1)
 
 /*
  * This gets called at the end of handling a page fault, when
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 92768d3..98b8bb6 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -375,6 +375,10 @@  int main(void)
 	DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
 	DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
 #endif
+#ifdef CONFIG_44x
+	DEFINE(PMD_SHIFT, PMD_SHIFT);
+	DEFINE(PTE_SHIFT, PTE_SHIFT);
+#endif
 
 	return 0;
 }
diff --git a/arch/powerpc/kernel/head_44x.S b/arch/powerpc/kernel/head_44x.S
index f3a1ea9..6525124 100644
--- a/arch/powerpc/kernel/head_44x.S
+++ b/arch/powerpc/kernel/head_44x.S
@@ -391,12 +391,14 @@  interrupt_base:
 	rlwimi	r13,r12,10,30,30
 
 	/* Load the PTE */
-	rlwinm 	r12, r10, 13, 19, 29	/* Compute pgdir/pmd offset */
+	/* Compute pgdir/pmd offset */
+	rlwinm  r12, r10, PPC44x_PGD_OFF_SHIFT, PPC44x_PGD_OFF_MASK, 29
 	lwzx	r11, r12, r11		/* Get pgd/pmd entry */
 	rlwinm.	r12, r11, 0, 0, 20	/* Extract pt base address */
 	beq	2f			/* Bail if no table */
 
-	rlwimi	r12, r10, 23, 20, 28	/* Compute pte address */
+	/* Compute pte address */
+	rlwimi  r12, r10, PPC44x_PTE_ADD_SHIFT, PPC44x_PTE_ADD_MASK, 28
 	lwz	r11, 0(r12)		/* Get high word of pte entry */
 	lwz	r12, 4(r12)		/* Get low word of pte entry */
 
@@ -485,12 +487,14 @@  tlb_44x_patch_hwater_D:
 	/* Make up the required permissions */
 	li	r13,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_HWEXEC
 
-	rlwinm	r12, r10, 13, 19, 29	/* Compute pgdir/pmd offset */
+	/* Compute pgdir/pmd offset */
+	rlwinm 	r12, r10, PPC44x_PGD_OFF_SHIFT, PPC44x_PGD_OFF_MASK, 29
 	lwzx	r11, r12, r11		/* Get pgd/pmd entry */
 	rlwinm.	r12, r11, 0, 0, 20	/* Extract pt base address */
 	beq	2f			/* Bail if no table */
 
-	rlwimi	r12, r10, 23, 20, 28	/* Compute pte address */
+	/* Compute pte address */
+	rlwimi	r12, r10, PPC44x_PTE_ADD_SHIFT, PPC44x_PTE_ADD_MASK, 28
 	lwz	r11, 0(r12)		/* Get high word of pte entry */
 	lwz	r12, 4(r12)		/* Get low word of pte entry */
 
@@ -554,15 +558,15 @@  tlb_44x_patch_hwater_I:
  */
 finish_tlb_load:
 	/* Combine RPN & ERPN an write WS 0 */
-	rlwimi	r11,r12,0,0,19
+	rlwimi	r11,r12,0,0,PPC44x_RPN_MASK
 	tlbwe	r11,r13,PPC44x_TLB_XLAT
 
 	/*
 	 * Create WS1. This is the faulting address (EPN),
 	 * page size, and valid flag.
 	 */
-	li	r11,PPC44x_TLB_VALID | PPC44x_TLB_4K
-	rlwimi	r10,r11,0,20,31			/* Insert valid and page size*/
+	li	r11,PPC44x_TLB_VALID | PPC44x_TLBE_SIZE
+	rlwimi	r10,r11,0,PPC44x_PTE_ADD_MASK,31/* Insert valid and page size*/
 	tlbwe	r10,r13,PPC44x_TLB_PAGEID	/* Write PAGEID */
 
 	/* And WS 2 */
@@ -634,12 +638,12 @@  _GLOBAL(set_context)
  * goes at the beginning of the data segment, which is page-aligned.
  */
 	.data
-	.align	12
+	.align	PAGE_SHIFT
 	.globl	sdata
 sdata:
 	.globl	empty_zero_page
 empty_zero_page:
-	.space	4096
+	.space	PAGE_SIZE
 
 /*
  * To support >32-bit physical addresses, we use an 8KB pgdir.
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 7a6dfbc..0110fcd 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -589,8 +589,8 @@  _GLOBAL(__flush_dcache_icache)
 BEGIN_FTR_SECTION
 	blr
 END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-	rlwinm	r3,r3,0,0,19			/* Get page base address */
-	li	r4,4096/L1_CACHE_BYTES	/* Number of lines in a page */
+	rlwinm	r3,r3,0,0,PPC44x_RPN_MASK	/* Get page base address */
+	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a page */
 	mtctr	r4
 	mr	r6,r3
 0:	dcbst	0,r3				/* Write line to ram */
@@ -630,8 +630,8 @@  END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
 	rlwinm	r0,r10,0,28,26			/* clear DR */
 	mtmsr	r0
 	isync
-	rlwinm	r3,r3,0,0,19			/* Get page base address */
-	li	r4,4096/L1_CACHE_BYTES	/* Number of lines in a page */
+	rlwinm	r3,r3,0,0,PPC44x_RPN_MASK	/* Get page base address */
+	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a page */
 	mtctr	r4
 	mr	r6,r3
 0:	dcbst	0,r3				/* Write line to ram */
@@ -655,7 +655,7 @@  END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
  * void clear_pages(void *page, int order) ;
  */
 _GLOBAL(clear_pages)
-	li	r0,4096/L1_CACHE_BYTES
+	li	r0,PAGE_SIZE/L1_CACHE_BYTES
 	slw	r0,r0,r4
 	mtctr	r0
 #ifdef CONFIG_8xx
@@ -713,7 +713,7 @@  _GLOBAL(copy_page)
 	dcbt	r5,r4
 	li	r11,L1_CACHE_BYTES+4
 #endif /* MAX_COPY_PREFETCH */
-	li	r0,4096/L1_CACHE_BYTES - MAX_COPY_PREFETCH
+	li	r0,PAGE_SIZE/L1_CACHE_BYTES - MAX_COPY_PREFETCH
 	crclr	4*cr0+eq
 2:
 	mtctr	r0
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 2001abd..4eed001 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -72,12 +72,7 @@  extern unsigned long p_mapped_by_tlbcam(unsigned long pa);
 #define p_mapped_by_tlbcam(x)	(0UL)
 #endif /* HAVE_TLBCAM */
 
-#ifdef CONFIG_PTE_64BIT
-/* 44x uses an 8kB pgdir because it has 8-byte Linux PTEs. */
-#define PGDIR_ORDER	1
-#else
-#define PGDIR_ORDER	0
-#endif
+#define PGDIR_ORDER	max(32 + PGD_T_LOG2 - PGDIR_SHIFT - PAGE_SHIFT, 0)
 
 pgd_t *pgd_alloc(struct mm_struct *mm)
 {
@@ -400,7 +395,7 @@  void kernel_map_pages(struct page *page, int numpages, int enable)
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
 static int fixmaps;
-unsigned long FIXADDR_TOP = 0xfffff000;
+unsigned long FIXADDR_TOP = (-PAGE_SIZE);
 EXPORT_SYMBOL(FIXADDR_TOP);
 
 void __set_fixmap (enum fixed_addresses idx, phys_addr_t phys, pgprot_t flags)
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 7f65127..a1386a4 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -202,7 +202,7 @@  config PPC_STD_MMU_32
 
 config PPC_MM_SLICES
 	bool
-	default y if HUGETLB_PAGE || PPC_64K_PAGES
+	default y if HUGETLB_PAGE || (PPC64 && PPC_64K_PAGES)
 	default n
 
 config VIRT_CPU_ACCOUNTING