[v3,12/24] powerpc/mm: use pte helpers in generic code

Message ID 343c844bbc5081d13ee4c9aa27ff3118f607e1cc.1539092112.git.christophe.leroy@c-s.fr
State Accepted
Commit 26973fa5ac0e3b88d0d476caccfc10839b26098b
Headers show
Series
  • ban the use of _PAGE_XXX flags outside platform specific code
Related show

Checks

Context Check Description
snowpatch_ozlabs/checkpatch warning Test checkpatch on branch next
snowpatch_ozlabs/apply_patch success next/apply_patch Successfully applied

Commit Message

Christophe Leroy Oct. 9, 2018, 1:51 p.m.
Get rid of platform specific _PAGE_XXXX in powerpc common code and
use helpers instead.

mm/dump_linuxpagetables.c will be handled separately

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 arch/powerpc/include/asm/book3s/32/pgtable.h |  9 +++------
 arch/powerpc/include/asm/nohash/32/pgtable.h | 12 ++++++++----
 arch/powerpc/include/asm/nohash/pgtable.h    |  3 +--
 arch/powerpc/mm/pgtable.c                    | 21 +++++++--------------
 arch/powerpc/mm/pgtable_32.c                 | 15 ++++++++-------
 arch/powerpc/mm/pgtable_64.c                 | 14 +++++++-------
 arch/powerpc/xmon/xmon.c                     | 12 +++++++-----
 7 files changed, 41 insertions(+), 45 deletions(-)

Comments

Michael Ellerman Oct. 17, 2018, 12:59 a.m. | #1
Christophe Leroy <christophe.leroy@c-s.fr> writes:

> Get rid of platform specific _PAGE_XXXX in powerpc common code and
> use helpers instead.
>
> mm/dump_linuxpagetables.c will be handled separately
>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
>  arch/powerpc/include/asm/book3s/32/pgtable.h |  9 +++------
>  arch/powerpc/include/asm/nohash/32/pgtable.h | 12 ++++++++----
>  arch/powerpc/include/asm/nohash/pgtable.h    |  3 +--
>  arch/powerpc/mm/pgtable.c                    | 21 +++++++--------------
>  arch/powerpc/mm/pgtable_32.c                 | 15 ++++++++-------
>  arch/powerpc/mm/pgtable_64.c                 | 14 +++++++-------
>  arch/powerpc/xmon/xmon.c                     | 12 +++++++-----
>  7 files changed, 41 insertions(+), 45 deletions(-)

So turns out this patch *also* breaks my p5020ds :)

Even with patch 4 merged, see next.

It's the same crash:

  pcieport 2000:00:00.0: AER enabled with IRQ 480
  Unable to handle kernel paging request for data at address 0x8000080080080000
  Faulting instruction address: 0xc0000000000192cc
  Oops: Kernel access of bad area, sig: 11 [#1]
  BE SMP NR_CPUS=24 CoreNet Generic
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc3-gcc7x-g98c847323b3a #1
  NIP:  c0000000000192cc LR: c0000000005d0f9c CTR: 0000000000100000
  REGS: c0000000f31bb400 TRAP: 0300   Not tainted  (4.19.0-rc3-gcc7x-g98c847323b3a)
  MSR:  0000000080029000 <CE,EE,ME>  CR: 24000224  XER: 00000000
  DEAR: 8000080080080000 ESR: 0000000000800000 IRQMASK: 0 
  GPR00: c0000000005d0f84 c0000000f31bb688 c00000000117dc00 8000080080080000 
  GPR04: 0000000000000000 0000000000400000 00000ffbff241010 c0000000f31b8000 
  GPR08: 0000000000000000 0000000000100000 0000000000000000 c0000000012d4710 
  GPR12: 0000000084000422 c0000000012ff000 c000000000002774 0000000000000000 
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
  GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
  GPR24: 0000000000000000 0000000000000000 8000080080080000 c0000000ffff89a8 
  GPR28: c0000000f3576400 c0000000f3576410 0000000000400000 c0000000012ecc98 
  NIP [c0000000000192cc] ._memset_io+0x6c/0x9c
  LR [c0000000005d0f9c] .fsl_qman_probe+0x198/0x928
  Call Trace:
  [c0000000f31bb688] [c0000000005d0f84] .fsl_qman_probe+0x180/0x928 (unreliable)
  [c0000000f31bb728] [c0000000006432ec] .platform_drv_probe+0x60/0xb4
  [c0000000f31bb7a8] [c00000000064083c] .really_probe+0x294/0x35c
  [c0000000f31bb848] [c000000000640d2c] .__driver_attach+0x148/0x14c
  [c0000000f31bb8d8] [c00000000063d7dc] .bus_for_each_dev+0xb0/0x118
  [c0000000f31bb988] [c00000000063ff28] .driver_attach+0x34/0x4c
  [c0000000f31bba08] [c00000000063f648] .bus_add_driver+0x174/0x2bc
  [c0000000f31bbaa8] [c0000000006418bc] .driver_register+0x90/0x180
  [c0000000f31bbb28] [c000000000643270] .__platform_driver_register+0x60/0x7c
  [c0000000f31bbba8] [c000000000ee2a70] .fsl_qman_driver_init+0x24/0x38
  [c0000000f31bbc18] [c0000000000023fc] .do_one_initcall+0x64/0x2b8
  [c0000000f31bbcf8] [c000000000e9f480] .kernel_init_freeable+0x3a8/0x494
  [c0000000f31bbda8] [c000000000002798] .kernel_init+0x24/0x148
  [c0000000f31bbe28] [c0000000000009e8] .ret_from_kernel_thread+0x58/0x70
  Instruction dump:
  4e800020 2ba50003 40dd003c 3925fffc 5488402e 7929f082 7d082378 39290001 
  550a801e 7d2903a6 7d4a4378 794a0020 <91430000> 38630004 4200fff8 70a50003 


Comparing a working vs broken kernel, it seems to boil down to the fact
that we're filtering out more PTE bits now that we use pte_pgprot() in
ioremap_prot().

With the old code we get:
  ioremap_prot: addr 0xff800000 flags 0x241215
  ioremap_prot: addr 0xff800000 flags 0x241215
  map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241215


And now we get:
  ioremap_prot: addr 0xff800000 flags 0x241215 pte 0x241215
  ioremap_prot: addr 0xff800000 pte 0x241215
  ioremap_prot: addr 0xff800000 prot 0x241014
  map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241014

So we're losing 0x201, which for nohash book3e is:

  #define _PAGE_PRESENT	0x000001 /* software: pte contains a translation */
  #define _PAGE_PSIZE_4K	0x000200


I haven't worked out if it's one or both of those that matter.

The question is what's the right way to fix it? Should pte_pgprot() not
be filtering those bits out on book3e?

cheers
Christophe Leroy Oct. 17, 2018, 6 a.m. | #2
On 10/17/2018 12:59 AM, Michael Ellerman wrote:
> Christophe Leroy <christophe.leroy@c-s.fr> writes:
> 
>> Get rid of platform specific _PAGE_XXXX in powerpc common code and
>> use helpers instead.
>>
>> mm/dump_linuxpagetables.c will be handled separately
>>
>> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>> ---
>>   arch/powerpc/include/asm/book3s/32/pgtable.h |  9 +++------
>>   arch/powerpc/include/asm/nohash/32/pgtable.h | 12 ++++++++----
>>   arch/powerpc/include/asm/nohash/pgtable.h    |  3 +--
>>   arch/powerpc/mm/pgtable.c                    | 21 +++++++--------------
>>   arch/powerpc/mm/pgtable_32.c                 | 15 ++++++++-------
>>   arch/powerpc/mm/pgtable_64.c                 | 14 +++++++-------
>>   arch/powerpc/xmon/xmon.c                     | 12 +++++++-----
>>   7 files changed, 41 insertions(+), 45 deletions(-)
> 
> So turns out this patch *also* breaks my p5020ds :)
> 
> Even with patch 4 merged, see next.
> 
> It's the same crash:
> 
>    pcieport 2000:00:00.0: AER enabled with IRQ 480
>    Unable to handle kernel paging request for data at address 0x8000080080080000
>    Faulting instruction address: 0xc0000000000192cc
>    Oops: Kernel access of bad area, sig: 11 [#1]
>    BE SMP NR_CPUS=24 CoreNet Generic
>    Modules linked in:
>    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc3-gcc7x-g98c847323b3a #1
>    NIP:  c0000000000192cc LR: c0000000005d0f9c CTR: 0000000000100000
>    REGS: c0000000f31bb400 TRAP: 0300   Not tainted  (4.19.0-rc3-gcc7x-g98c847323b3a)
>    MSR:  0000000080029000 <CE,EE,ME>  CR: 24000224  XER: 00000000
>    DEAR: 8000080080080000 ESR: 0000000000800000 IRQMASK: 0
>    GPR00: c0000000005d0f84 c0000000f31bb688 c00000000117dc00 8000080080080000
>    GPR04: 0000000000000000 0000000000400000 00000ffbff241010 c0000000f31b8000
>    GPR08: 0000000000000000 0000000000100000 0000000000000000 c0000000012d4710
>    GPR12: 0000000084000422 c0000000012ff000 c000000000002774 0000000000000000
>    GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>    GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>    GPR24: 0000000000000000 0000000000000000 8000080080080000 c0000000ffff89a8
>    GPR28: c0000000f3576400 c0000000f3576410 0000000000400000 c0000000012ecc98
>    NIP [c0000000000192cc] ._memset_io+0x6c/0x9c
>    LR [c0000000005d0f9c] .fsl_qman_probe+0x198/0x928
>    Call Trace:
>    [c0000000f31bb688] [c0000000005d0f84] .fsl_qman_probe+0x180/0x928 (unreliable)
>    [c0000000f31bb728] [c0000000006432ec] .platform_drv_probe+0x60/0xb4
>    [c0000000f31bb7a8] [c00000000064083c] .really_probe+0x294/0x35c
>    [c0000000f31bb848] [c000000000640d2c] .__driver_attach+0x148/0x14c
>    [c0000000f31bb8d8] [c00000000063d7dc] .bus_for_each_dev+0xb0/0x118
>    [c0000000f31bb988] [c00000000063ff28] .driver_attach+0x34/0x4c
>    [c0000000f31bba08] [c00000000063f648] .bus_add_driver+0x174/0x2bc
>    [c0000000f31bbaa8] [c0000000006418bc] .driver_register+0x90/0x180
>    [c0000000f31bbb28] [c000000000643270] .__platform_driver_register+0x60/0x7c
>    [c0000000f31bbba8] [c000000000ee2a70] .fsl_qman_driver_init+0x24/0x38
>    [c0000000f31bbc18] [c0000000000023fc] .do_one_initcall+0x64/0x2b8
>    [c0000000f31bbcf8] [c000000000e9f480] .kernel_init_freeable+0x3a8/0x494
>    [c0000000f31bbda8] [c000000000002798] .kernel_init+0x24/0x148
>    [c0000000f31bbe28] [c0000000000009e8] .ret_from_kernel_thread+0x58/0x70
>    Instruction dump:
>    4e800020 2ba50003 40dd003c 3925fffc 5488402e 7929f082 7d082378 39290001
>    550a801e 7d2903a6 7d4a4378 794a0020 <91430000> 38630004 4200fff8 70a50003
> 
> 
> Comparing a working vs broken kernel, it seems to boil down to the fact
> that we're filtering out more PTE bits now that we use pte_pgprot() in
> ioremap_prot().
> 
> With the old code we get:
>    ioremap_prot: addr 0xff800000 flags 0x241215
>    ioremap_prot: addr 0xff800000 flags 0x241215
>    map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241215
> 
> 
> And now we get:
>    ioremap_prot: addr 0xff800000 flags 0x241215 pte 0x241215
>    ioremap_prot: addr 0xff800000 pte 0x241215
>    ioremap_prot: addr 0xff800000 prot 0x241014
>    map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241014
> 
> So we're losing 0x201, which for nohash book3e is:
> 
>    #define _PAGE_PRESENT	0x000001 /* software: pte contains a translation */
>    #define _PAGE_PSIZE_4K	0x000200
> 
> 
> I haven't worked out if it's one or both of those that matter.

At least missing _PAGE_PRESENT is an issue I believe.
> 
> The question is what's the right way to fix it? Should pte_pgprot() not
> be filtering those bits out on book3e?

I think we should not use pte_pggrot() for that then. What about the 
below fix ?

Christophe

From: Christophe Leroy <christophe.leroy@c-s.fr>
Date: Wed, 17 Oct 2018 05:56:25 +0000
Subject: [PATCH] powerpc/mm: don't use pte_pgprot() in ioremap_prot()

pte_pgprot() filters out some required flags like _PAGE_PRESENT.

This patch replaces pte_pgprot() by __pgprot(pte_val())
in ioremap_prot()

Fixes: 26973fa5ac0e ("powerpc/mm: use pte helpers in generic code")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
  arch/powerpc/mm/pgtable_32.c | 3 ++-
  arch/powerpc/mm/pgtable_64.c | 4 ++--
  2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 5877f5aa8f5d..a606e2f4937b 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -122,7 +122,8 @@ ioremap_prot(phys_addr_t addr, unsigned long size, 
unsigned long flags)
  	pte = pte_exprotect(pte);
  	pte = pte_mkprivileged(pte);

-	return __ioremap_caller(addr, size, pte_pgprot(pte), 
__builtin_return_address(0));
+	return __ioremap_caller(addr, size, __pgprot(pte_val(pte)),
+				__builtin_return_address(0));
  }
  EXPORT_SYMBOL(ioremap_prot);

diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index fb1375c07e8c..836bf436cabb 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -245,8 +245,8 @@ void __iomem * ioremap_prot(phys_addr_t addr, 
unsigned long size,
  	pte = pte_mkprivileged(pte);

  	if (ppc_md.ioremap)
-		return ppc_md.ioremap(addr, size, pte_pgprot(pte), caller);
-	return __ioremap_caller(addr, size, pte_pgprot(pte), caller);
+		return ppc_md.ioremap(addr, size, __pgprot(pte_val(pte)), caller);
+	return __ioremap_caller(addr, size, __pgprot(pte_val(pte)), caller);
  }
Aneesh Kumar K.V Oct. 17, 2018, 9:39 a.m. | #3
Christophe Leroy <christophe.leroy@c-s.fr> writes:

> On 10/17/2018 12:59 AM, Michael Ellerman wrote:
>> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>> 
>>> Get rid of platform specific _PAGE_XXXX in powerpc common code and
>>> use helpers instead.
>>>
>>> mm/dump_linuxpagetables.c will be handled separately
>>>
>>> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>>> ---
>>>   arch/powerpc/include/asm/book3s/32/pgtable.h |  9 +++------
>>>   arch/powerpc/include/asm/nohash/32/pgtable.h | 12 ++++++++----
>>>   arch/powerpc/include/asm/nohash/pgtable.h    |  3 +--
>>>   arch/powerpc/mm/pgtable.c                    | 21 +++++++--------------
>>>   arch/powerpc/mm/pgtable_32.c                 | 15 ++++++++-------
>>>   arch/powerpc/mm/pgtable_64.c                 | 14 +++++++-------
>>>   arch/powerpc/xmon/xmon.c                     | 12 +++++++-----
>>>   7 files changed, 41 insertions(+), 45 deletions(-)
>> 
>> So turns out this patch *also* breaks my p5020ds :)
>> 
>> Even with patch 4 merged, see next.
>> 
>> It's the same crash:
>> 
>>    pcieport 2000:00:00.0: AER enabled with IRQ 480
>>    Unable to handle kernel paging request for data at address 0x8000080080080000
>>    Faulting instruction address: 0xc0000000000192cc
>>    Oops: Kernel access of bad area, sig: 11 [#1]
>>    BE SMP NR_CPUS=24 CoreNet Generic
>>    Modules linked in:
>>    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc3-gcc7x-g98c847323b3a #1
>>    NIP:  c0000000000192cc LR: c0000000005d0f9c CTR: 0000000000100000
>>    REGS: c0000000f31bb400 TRAP: 0300   Not tainted  (4.19.0-rc3-gcc7x-g98c847323b3a)
>>    MSR:  0000000080029000 <CE,EE,ME>  CR: 24000224  XER: 00000000
>>    DEAR: 8000080080080000 ESR: 0000000000800000 IRQMASK: 0
>>    GPR00: c0000000005d0f84 c0000000f31bb688 c00000000117dc00 8000080080080000
>>    GPR04: 0000000000000000 0000000000400000 00000ffbff241010 c0000000f31b8000
>>    GPR08: 0000000000000000 0000000000100000 0000000000000000 c0000000012d4710
>>    GPR12: 0000000084000422 c0000000012ff000 c000000000002774 0000000000000000
>>    GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>    GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>    GPR24: 0000000000000000 0000000000000000 8000080080080000 c0000000ffff89a8
>>    GPR28: c0000000f3576400 c0000000f3576410 0000000000400000 c0000000012ecc98
>>    NIP [c0000000000192cc] ._memset_io+0x6c/0x9c
>>    LR [c0000000005d0f9c] .fsl_qman_probe+0x198/0x928
>>    Call Trace:
>>    [c0000000f31bb688] [c0000000005d0f84] .fsl_qman_probe+0x180/0x928 (unreliable)
>>    [c0000000f31bb728] [c0000000006432ec] .platform_drv_probe+0x60/0xb4
>>    [c0000000f31bb7a8] [c00000000064083c] .really_probe+0x294/0x35c
>>    [c0000000f31bb848] [c000000000640d2c] .__driver_attach+0x148/0x14c
>>    [c0000000f31bb8d8] [c00000000063d7dc] .bus_for_each_dev+0xb0/0x118
>>    [c0000000f31bb988] [c00000000063ff28] .driver_attach+0x34/0x4c
>>    [c0000000f31bba08] [c00000000063f648] .bus_add_driver+0x174/0x2bc
>>    [c0000000f31bbaa8] [c0000000006418bc] .driver_register+0x90/0x180
>>    [c0000000f31bbb28] [c000000000643270] .__platform_driver_register+0x60/0x7c
>>    [c0000000f31bbba8] [c000000000ee2a70] .fsl_qman_driver_init+0x24/0x38
>>    [c0000000f31bbc18] [c0000000000023fc] .do_one_initcall+0x64/0x2b8
>>    [c0000000f31bbcf8] [c000000000e9f480] .kernel_init_freeable+0x3a8/0x494
>>    [c0000000f31bbda8] [c000000000002798] .kernel_init+0x24/0x148
>>    [c0000000f31bbe28] [c0000000000009e8] .ret_from_kernel_thread+0x58/0x70
>>    Instruction dump:
>>    4e800020 2ba50003 40dd003c 3925fffc 5488402e 7929f082 7d082378 39290001
>>    550a801e 7d2903a6 7d4a4378 794a0020 <91430000> 38630004 4200fff8 70a50003
>> 
>> 
>> Comparing a working vs broken kernel, it seems to boil down to the fact
>> that we're filtering out more PTE bits now that we use pte_pgprot() in
>> ioremap_prot().
>> 
>> With the old code we get:
>>    ioremap_prot: addr 0xff800000 flags 0x241215
>>    ioremap_prot: addr 0xff800000 flags 0x241215
>>    map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241215
>> 
>> 
>> And now we get:
>>    ioremap_prot: addr 0xff800000 flags 0x241215 pte 0x241215
>>    ioremap_prot: addr 0xff800000 pte 0x241215
>>    ioremap_prot: addr 0xff800000 prot 0x241014
>>    map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241014
>> 
>> So we're losing 0x201, which for nohash book3e is:
>> 
>>    #define _PAGE_PRESENT	0x000001 /* software: pte contains a translation */
>>    #define _PAGE_PSIZE_4K	0x000200
>> 
>> 
>> I haven't worked out if it's one or both of those that matter.
>
> At least missing _PAGE_PRESENT is an issue I believe.
>> 
>> The question is what's the right way to fix it? Should pte_pgprot() not
>> be filtering those bits out on book3e?
>
> I think we should not use pte_pggrot() for that then. What about the 
> below fix ?
>
> Christophe
>
> From: Christophe Leroy <christophe.leroy@c-s.fr>
> Date: Wed, 17 Oct 2018 05:56:25 +0000
> Subject: [PATCH] powerpc/mm: don't use pte_pgprot() in ioremap_prot()
>
> pte_pgprot() filters out some required flags like _PAGE_PRESENT.
>
> This patch replaces pte_pgprot() by __pgprot(pte_val())
> in ioremap_prot()
>
> Fixes: 26973fa5ac0e ("powerpc/mm: use pte helpers in generic code")
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
>   arch/powerpc/mm/pgtable_32.c | 3 ++-
>   arch/powerpc/mm/pgtable_64.c | 4 ++--
>   2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
> index 5877f5aa8f5d..a606e2f4937b 100644
> --- a/arch/powerpc/mm/pgtable_32.c
> +++ b/arch/powerpc/mm/pgtable_32.c
> @@ -122,7 +122,8 @@ ioremap_prot(phys_addr_t addr, unsigned long size, 
> unsigned long flags)
>   	pte = pte_exprotect(pte);
>   	pte = pte_mkprivileged(pte);
>
> -	return __ioremap_caller(addr, size, pte_pgprot(pte), 
> __builtin_return_address(0));
> +	return __ioremap_caller(addr, size, __pgprot(pte_val(pte)),
> +				__builtin_return_address(0));


That means we pass the pfn bits also to __ioremap_caller right? How about

From b4d5e0f24f8482375b2dd86afaced26ebf716600 Mon Sep 17 00:00:00 2001
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Date: Wed, 17 Oct 2018 14:07:50 +0530
Subject: [PATCH] powerpc/mm: Make pte_pgprot return all pte bits

Other archs do the same and instead of adding required pte bits (which got
masked out) in __ioremap_at, make sure we filter only pfn bits out.

Fixes: 26973fa5ac0e ("powerpc/mm: use pte helpers in generic code")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 arch/powerpc/include/asm/book3s/32/pgtable.h       |  6 ------
 arch/powerpc/include/asm/book3s/64/pgtable.h       |  8 --------
 arch/powerpc/include/asm/nohash/32/pte-40x.h       |  5 -----
 arch/powerpc/include/asm/nohash/32/pte-44x.h       |  5 -----
 arch/powerpc/include/asm/nohash/32/pte-8xx.h       |  5 -----
 arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h |  5 -----
 arch/powerpc/include/asm/nohash/pgtable.h          |  1 -
 arch/powerpc/include/asm/nohash/pte-book3e.h       |  5 -----
 arch/powerpc/include/asm/pgtable.h                 | 10 ++++++++++
 9 files changed, 10 insertions(+), 40 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 0fbd4c642b51..e61dd3ae5bc0 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -48,11 +48,6 @@ static inline bool pte_user(pte_t pte)
 #define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HASHPTE | _PAGE_DIRTY | \
 			 _PAGE_ACCESSED | _PAGE_SPECIAL)
 
-/* Mask of bits returned by pte_pgprot() */
-#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
-			 _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \
-			 _PAGE_RW | _PAGE_DIRTY)
-
 /*
  * We define 2 sets of base prot bits, one for basic pages (ie,
  * cacheable kernel and user pages) and one for non cacheable
@@ -396,7 +391,6 @@ static inline int pte_young(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_ACCESSE
 static inline int pte_special(pte_t pte)	{ return !!(pte_val(pte) & _PAGE_SPECIAL); }
 static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
 static inline bool pte_exec(pte_t pte)		{ return true; }
-static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
 
 static inline int pte_present(pte_t pte)
 {
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index c34a161dc651..cb5dd4078d42 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -128,13 +128,6 @@
 
 #define H_PTE_PKEY  (H_PTE_PKEY_BIT0 | H_PTE_PKEY_BIT1 | H_PTE_PKEY_BIT2 | \
 		     H_PTE_PKEY_BIT3 | H_PTE_PKEY_BIT4)
-/*
- * Mask of bits returned by pte_pgprot()
- */
-#define PAGE_PROT_BITS  (_PAGE_SAO | _PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT | \
-			 H_PAGE_4K_PFN | _PAGE_PRIVILEGED | _PAGE_ACCESSED | \
-			 _PAGE_READ | _PAGE_WRITE |  _PAGE_DIRTY | _PAGE_EXEC | \
-			 _PAGE_SOFT_DIRTY | H_PTE_PKEY)
 /*
  * We define 2 sets of base prot bits, one for basic pages (ie,
  * cacheable kernel and user pages) and one for non cacheable
@@ -496,7 +489,6 @@ static inline bool pte_exec(pte_t pte)
 	return !!(pte_raw(pte) & cpu_to_be64(_PAGE_EXEC));
 }
 
-static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
 
 #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
 static inline bool pte_soft_dirty(pte_t pte)
diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h b/arch/powerpc/include/asm/nohash/32/pte-40x.h
index 7a8b3c94592f..661f4599f2fc 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-40x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h
@@ -73,11 +73,6 @@
 /* Until my rework is finished, 40x still needs atomic PTE updates */
 #define PTE_ATOMIC_UPDATES	1
 
-/* Mask of bits returned by pte_pgprot() */
-#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_NO_CACHE | \
-			 _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \
-			 _PAGE_RW | _PAGE_HWWRITE | _PAGE_DIRTY | _PAGE_EXEC)
-
 #define _PAGE_BASE_NC	(_PAGE_PRESENT | _PAGE_ACCESSED)
 #define _PAGE_BASE	(_PAGE_BASE_NC)
 
diff --git a/arch/powerpc/include/asm/nohash/32/pte-44x.h b/arch/powerpc/include/asm/nohash/32/pte-44x.h
index 8d6b268a986f..78bc304f750e 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-44x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-44x.h
@@ -93,11 +93,6 @@
 #define _PAGE_KERNEL_RW		(_PAGE_DIRTY | _PAGE_RW)
 #define _PAGE_KERNEL_RWX	(_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC)
 
-/* Mask of bits returned by pte_pgprot() */
-#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
-			 _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \
-			 _PAGE_RW | _PAGE_DIRTY | _PAGE_EXEC)
-
 /* TODO: Add large page lowmem mapping support */
 #define _PMD_PRESENT	0
 #define _PMD_PRESENT_MASK (PAGE_MASK)
diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index 1c57efac089d..6bfe041ef59d 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -55,11 +55,6 @@
 #define _PAGE_KERNEL_RW		(_PAGE_SH | _PAGE_DIRTY)
 #define _PAGE_KERNEL_RWX	(_PAGE_SH | _PAGE_DIRTY | _PAGE_EXEC)
 
-/* Mask of bits returned by pte_pgprot() */
-#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_NO_CACHE | \
-			 _PAGE_ACCESSED | _PAGE_RO | _PAGE_NA | \
-			 _PAGE_SH | _PAGE_DIRTY | _PAGE_EXEC)
-
 #define _PMD_PRESENT	0x0001
 #define _PMD_PRESENT_MASK	_PMD_PRESENT
 #define _PMD_BAD	0x0fd0
diff --git a/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
index 1ecf60fe0909..0fc1bd42bb3e 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
@@ -39,11 +39,6 @@
 /* No page size encoding in the linux PTE */
 #define _PAGE_PSIZE		0
 
-/* Mask of bits returned by pte_pgprot() */
-#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
-			 _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \
-			 _PAGE_RW | _PAGE_DIRTY | _PAGE_EXEC)
-
 #define _PMD_PRESENT	0
 #define _PMD_PRESENT_MASK (PAGE_MASK)
 #define _PMD_BAD	(~PAGE_MASK)
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index 04e9f0922ad4..70ff23974b59 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -52,7 +52,6 @@ static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK)
 static inline bool pte_hashpte(pte_t pte)	{ return false; }
 static inline bool pte_ci(pte_t pte)		{ return pte_val(pte) & _PAGE_NO_CACHE; }
 static inline bool pte_exec(pte_t pte)		{ return pte_val(pte) & _PAGE_EXEC; }
-static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
 
 #ifdef CONFIG_NUMA_BALANCING
 /*
diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h b/arch/powerpc/include/asm/nohash/pte-book3e.h
index 58eef8cb569d..f95ab6eaf441 100644
--- a/arch/powerpc/include/asm/nohash/pte-book3e.h
+++ b/arch/powerpc/include/asm/nohash/pte-book3e.h
@@ -82,11 +82,6 @@
 #define _PTE_NONE_MASK	0
 #endif
 
-/* Mask of bits returned by pte_pgprot() */
-#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
-			 _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \
-			 _PAGE_PRIVILEGED | _PAGE_RW | _PAGE_DIRTY | _PAGE_EXEC)
-
 /*
  * We define 2 sets of base prot bits, one for basic pages (ie,
  * cacheable kernel and user pages) and one for non cacheable
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index fb4b85bba110..9679b7519a35 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -46,6 +46,16 @@ struct mm_struct;
 /* Keep these as a macros to avoid include dependency mess */
 #define pte_page(x)		pfn_to_page(pte_pfn(x))
 #define mk_pte(page, pgprot)	pfn_pte(page_to_pfn(page), (pgprot))
+/*
+ * Select all bits except the pfn
+ */
+static inline pgprot_t pte_pgprot(pte_t pte)
+{
+	unsigned long pte_flags;
+
+	pte_flags = pte_val(pte) & ~PTE_RPN_MASK;
+	return __pgprot(pte_flags);
+}
 
 /*
  * ZERO_PAGE is a global shared page that is always zero: used
Christophe Leroy Oct. 17, 2018, 9:55 a.m. | #4
Le 17/10/2018 à 11:39, Aneesh Kumar K.V a écrit :
> Christophe Leroy <christophe.leroy@c-s.fr> writes:
> 
>> On 10/17/2018 12:59 AM, Michael Ellerman wrote:
>>> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>>>
>>>> Get rid of platform specific _PAGE_XXXX in powerpc common code and
>>>> use helpers instead.
>>>>
>>>> mm/dump_linuxpagetables.c will be handled separately
>>>>
>>>> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>>>> ---
>>>>    arch/powerpc/include/asm/book3s/32/pgtable.h |  9 +++------
>>>>    arch/powerpc/include/asm/nohash/32/pgtable.h | 12 ++++++++----
>>>>    arch/powerpc/include/asm/nohash/pgtable.h    |  3 +--
>>>>    arch/powerpc/mm/pgtable.c                    | 21 +++++++--------------
>>>>    arch/powerpc/mm/pgtable_32.c                 | 15 ++++++++-------
>>>>    arch/powerpc/mm/pgtable_64.c                 | 14 +++++++-------
>>>>    arch/powerpc/xmon/xmon.c                     | 12 +++++++-----
>>>>    7 files changed, 41 insertions(+), 45 deletions(-)
>>>
>>> So turns out this patch *also* breaks my p5020ds :)
>>>
>>> Even with patch 4 merged, see next.
>>>
>>> It's the same crash:
>>>
>>>     pcieport 2000:00:00.0: AER enabled with IRQ 480
>>>     Unable to handle kernel paging request for data at address 0x8000080080080000
>>>     Faulting instruction address: 0xc0000000000192cc
>>>     Oops: Kernel access of bad area, sig: 11 [#1]
>>>     BE SMP NR_CPUS=24 CoreNet Generic
>>>     Modules linked in:
>>>     CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc3-gcc7x-g98c847323b3a #1
>>>     NIP:  c0000000000192cc LR: c0000000005d0f9c CTR: 0000000000100000
>>>     REGS: c0000000f31bb400 TRAP: 0300   Not tainted  (4.19.0-rc3-gcc7x-g98c847323b3a)
>>>     MSR:  0000000080029000 <CE,EE,ME>  CR: 24000224  XER: 00000000
>>>     DEAR: 8000080080080000 ESR: 0000000000800000 IRQMASK: 0
>>>     GPR00: c0000000005d0f84 c0000000f31bb688 c00000000117dc00 8000080080080000
>>>     GPR04: 0000000000000000 0000000000400000 00000ffbff241010 c0000000f31b8000
>>>     GPR08: 0000000000000000 0000000000100000 0000000000000000 c0000000012d4710
>>>     GPR12: 0000000084000422 c0000000012ff000 c000000000002774 0000000000000000
>>>     GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>     GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>     GPR24: 0000000000000000 0000000000000000 8000080080080000 c0000000ffff89a8
>>>     GPR28: c0000000f3576400 c0000000f3576410 0000000000400000 c0000000012ecc98
>>>     NIP [c0000000000192cc] ._memset_io+0x6c/0x9c
>>>     LR [c0000000005d0f9c] .fsl_qman_probe+0x198/0x928
>>>     Call Trace:
>>>     [c0000000f31bb688] [c0000000005d0f84] .fsl_qman_probe+0x180/0x928 (unreliable)
>>>     [c0000000f31bb728] [c0000000006432ec] .platform_drv_probe+0x60/0xb4
>>>     [c0000000f31bb7a8] [c00000000064083c] .really_probe+0x294/0x35c
>>>     [c0000000f31bb848] [c000000000640d2c] .__driver_attach+0x148/0x14c
>>>     [c0000000f31bb8d8] [c00000000063d7dc] .bus_for_each_dev+0xb0/0x118
>>>     [c0000000f31bb988] [c00000000063ff28] .driver_attach+0x34/0x4c
>>>     [c0000000f31bba08] [c00000000063f648] .bus_add_driver+0x174/0x2bc
>>>     [c0000000f31bbaa8] [c0000000006418bc] .driver_register+0x90/0x180
>>>     [c0000000f31bbb28] [c000000000643270] .__platform_driver_register+0x60/0x7c
>>>     [c0000000f31bbba8] [c000000000ee2a70] .fsl_qman_driver_init+0x24/0x38
>>>     [c0000000f31bbc18] [c0000000000023fc] .do_one_initcall+0x64/0x2b8
>>>     [c0000000f31bbcf8] [c000000000e9f480] .kernel_init_freeable+0x3a8/0x494
>>>     [c0000000f31bbda8] [c000000000002798] .kernel_init+0x24/0x148
>>>     [c0000000f31bbe28] [c0000000000009e8] .ret_from_kernel_thread+0x58/0x70
>>>     Instruction dump:
>>>     4e800020 2ba50003 40dd003c 3925fffc 5488402e 7929f082 7d082378 39290001
>>>     550a801e 7d2903a6 7d4a4378 794a0020 <91430000> 38630004 4200fff8 70a50003
>>>
>>>
>>> Comparing a working vs broken kernel, it seems to boil down to the fact
>>> that we're filtering out more PTE bits now that we use pte_pgprot() in
>>> ioremap_prot().
>>>
>>> With the old code we get:
>>>     ioremap_prot: addr 0xff800000 flags 0x241215
>>>     ioremap_prot: addr 0xff800000 flags 0x241215
>>>     map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241215
>>>
>>>
>>> And now we get:
>>>     ioremap_prot: addr 0xff800000 flags 0x241215 pte 0x241215
>>>     ioremap_prot: addr 0xff800000 pte 0x241215
>>>     ioremap_prot: addr 0xff800000 prot 0x241014
>>>     map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241014
>>>
>>> So we're losing 0x201, which for nohash book3e is:
>>>
>>>     #define _PAGE_PRESENT	0x000001 /* software: pte contains a translation */
>>>     #define _PAGE_PSIZE_4K	0x000200
>>>
>>>
>>> I haven't worked out if it's one or both of those that matter.
>>
>> At least missing _PAGE_PRESENT is an issue I believe.
>>>
>>> The question is what's the right way to fix it? Should pte_pgprot() not
>>> be filtering those bits out on book3e?
>>
>> I think we should not use pte_pggrot() for that then. What about the
>> below fix ?
>>
>> Christophe
>>
>> From: Christophe Leroy <christophe.leroy@c-s.fr>
>> Date: Wed, 17 Oct 2018 05:56:25 +0000
>> Subject: [PATCH] powerpc/mm: don't use pte_pgprot() in ioremap_prot()
>>
>> pte_pgprot() filters out some required flags like _PAGE_PRESENT.
>>
>> This patch replaces pte_pgprot() by __pgprot(pte_val())
>> in ioremap_prot()
>>
>> Fixes: 26973fa5ac0e ("powerpc/mm: use pte helpers in generic code")
>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>> ---
>>    arch/powerpc/mm/pgtable_32.c | 3 ++-
>>    arch/powerpc/mm/pgtable_64.c | 4 ++--
>>    2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
>> index 5877f5aa8f5d..a606e2f4937b 100644
>> --- a/arch/powerpc/mm/pgtable_32.c
>> +++ b/arch/powerpc/mm/pgtable_32.c
>> @@ -122,7 +122,8 @@ ioremap_prot(phys_addr_t addr, unsigned long size,
>> unsigned long flags)
>>    	pte = pte_exprotect(pte);
>>    	pte = pte_mkprivileged(pte);
>>
>> -	return __ioremap_caller(addr, size, pte_pgprot(pte),
>> __builtin_return_address(0));
>> +	return __ioremap_caller(addr, size, __pgprot(pte_val(pte)),
>> +				__builtin_return_address(0));
> 
> 
> That means we pass the pfn bits also to __ioremap_caller right? How about
> 
>  From b4d5e0f24f8482375b2dd86afaced26ebf716600 Mon Sep 17 00:00:00 2001
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
> Date: Wed, 17 Oct 2018 14:07:50 +0530
> Subject: [PATCH] powerpc/mm: Make pte_pgprot return all pte bits
> 
> Other archs do the same and instead of adding required pte bits (which got
> masked out) in __ioremap_at, make sure we filter only pfn bits out.
> 
> Fixes: 26973fa5ac0e ("powerpc/mm: use pte helpers in generic code")
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

Looks good for me.

Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>

> ---
>   arch/powerpc/include/asm/book3s/32/pgtable.h       |  6 ------
>   arch/powerpc/include/asm/book3s/64/pgtable.h       |  8 --------
>   arch/powerpc/include/asm/nohash/32/pte-40x.h       |  5 -----
>   arch/powerpc/include/asm/nohash/32/pte-44x.h       |  5 -----
>   arch/powerpc/include/asm/nohash/32/pte-8xx.h       |  5 -----
>   arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h |  5 -----
>   arch/powerpc/include/asm/nohash/pgtable.h          |  1 -
>   arch/powerpc/include/asm/nohash/pte-book3e.h       |  5 -----
>   arch/powerpc/include/asm/pgtable.h                 | 10 ++++++++++
>   9 files changed, 10 insertions(+), 40 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
> index 0fbd4c642b51..e61dd3ae5bc0 100644
> --- a/arch/powerpc/include/asm/book3s/32/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
> @@ -48,11 +48,6 @@ static inline bool pte_user(pte_t pte)
>   #define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HASHPTE | _PAGE_DIRTY | \
>   			 _PAGE_ACCESSED | _PAGE_SPECIAL)
>   
> -/* Mask of bits returned by pte_pgprot() */
> -#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
> -			 _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \
> -			 _PAGE_RW | _PAGE_DIRTY)
> -
>   /*
>    * We define 2 sets of base prot bits, one for basic pages (ie,
>    * cacheable kernel and user pages) and one for non cacheable
> @@ -396,7 +391,6 @@ static inline int pte_young(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_ACCESSE
>   static inline int pte_special(pte_t pte)	{ return !!(pte_val(pte) & _PAGE_SPECIAL); }
>   static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
>   static inline bool pte_exec(pte_t pte)		{ return true; }
> -static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
>   
>   static inline int pte_present(pte_t pte)
>   {
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index c34a161dc651..cb5dd4078d42 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -128,13 +128,6 @@
>   
>   #define H_PTE_PKEY  (H_PTE_PKEY_BIT0 | H_PTE_PKEY_BIT1 | H_PTE_PKEY_BIT2 | \
>   		     H_PTE_PKEY_BIT3 | H_PTE_PKEY_BIT4)
> -/*
> - * Mask of bits returned by pte_pgprot()
> - */
> -#define PAGE_PROT_BITS  (_PAGE_SAO | _PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT | \
> -			 H_PAGE_4K_PFN | _PAGE_PRIVILEGED | _PAGE_ACCESSED | \
> -			 _PAGE_READ | _PAGE_WRITE |  _PAGE_DIRTY | _PAGE_EXEC | \
> -			 _PAGE_SOFT_DIRTY | H_PTE_PKEY)
>   /*
>    * We define 2 sets of base prot bits, one for basic pages (ie,
>    * cacheable kernel and user pages) and one for non cacheable
> @@ -496,7 +489,6 @@ static inline bool pte_exec(pte_t pte)
>   	return !!(pte_raw(pte) & cpu_to_be64(_PAGE_EXEC));
>   }
>   
> -static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
>   
>   #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
>   static inline bool pte_soft_dirty(pte_t pte)
> diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h b/arch/powerpc/include/asm/nohash/32/pte-40x.h
> index 7a8b3c94592f..661f4599f2fc 100644
> --- a/arch/powerpc/include/asm/nohash/32/pte-40x.h
> +++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h
> @@ -73,11 +73,6 @@
>   /* Until my rework is finished, 40x still needs atomic PTE updates */
>   #define PTE_ATOMIC_UPDATES	1
>   
> -/* Mask of bits returned by pte_pgprot() */
> -#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_NO_CACHE | \
> -			 _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \
> -			 _PAGE_RW | _PAGE_HWWRITE | _PAGE_DIRTY | _PAGE_EXEC)
> -
>   #define _PAGE_BASE_NC	(_PAGE_PRESENT | _PAGE_ACCESSED)
>   #define _PAGE_BASE	(_PAGE_BASE_NC)
>   
> diff --git a/arch/powerpc/include/asm/nohash/32/pte-44x.h b/arch/powerpc/include/asm/nohash/32/pte-44x.h
> index 8d6b268a986f..78bc304f750e 100644
> --- a/arch/powerpc/include/asm/nohash/32/pte-44x.h
> +++ b/arch/powerpc/include/asm/nohash/32/pte-44x.h
> @@ -93,11 +93,6 @@
>   #define _PAGE_KERNEL_RW		(_PAGE_DIRTY | _PAGE_RW)
>   #define _PAGE_KERNEL_RWX	(_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC)
>   
> -/* Mask of bits returned by pte_pgprot() */
> -#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
> -			 _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \
> -			 _PAGE_RW | _PAGE_DIRTY | _PAGE_EXEC)
> -
>   /* TODO: Add large page lowmem mapping support */
>   #define _PMD_PRESENT	0
>   #define _PMD_PRESENT_MASK (PAGE_MASK)
> diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
> index 1c57efac089d..6bfe041ef59d 100644
> --- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
> +++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
> @@ -55,11 +55,6 @@
>   #define _PAGE_KERNEL_RW		(_PAGE_SH | _PAGE_DIRTY)
>   #define _PAGE_KERNEL_RWX	(_PAGE_SH | _PAGE_DIRTY | _PAGE_EXEC)
>   
> -/* Mask of bits returned by pte_pgprot() */
> -#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_NO_CACHE | \
> -			 _PAGE_ACCESSED | _PAGE_RO | _PAGE_NA | \
> -			 _PAGE_SH | _PAGE_DIRTY | _PAGE_EXEC)
> -
>   #define _PMD_PRESENT	0x0001
>   #define _PMD_PRESENT_MASK	_PMD_PRESENT
>   #define _PMD_BAD	0x0fd0
> diff --git a/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
> index 1ecf60fe0909..0fc1bd42bb3e 100644
> --- a/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
> +++ b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
> @@ -39,11 +39,6 @@
>   /* No page size encoding in the linux PTE */
>   #define _PAGE_PSIZE		0
>   
> -/* Mask of bits returned by pte_pgprot() */
> -#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
> -			 _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \
> -			 _PAGE_RW | _PAGE_DIRTY | _PAGE_EXEC)
> -
>   #define _PMD_PRESENT	0
>   #define _PMD_PRESENT_MASK (PAGE_MASK)
>   #define _PMD_BAD	(~PAGE_MASK)
> diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
> index 04e9f0922ad4..70ff23974b59 100644
> --- a/arch/powerpc/include/asm/nohash/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/pgtable.h
> @@ -52,7 +52,6 @@ static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK)
>   static inline bool pte_hashpte(pte_t pte)	{ return false; }
>   static inline bool pte_ci(pte_t pte)		{ return pte_val(pte) & _PAGE_NO_CACHE; }
>   static inline bool pte_exec(pte_t pte)		{ return pte_val(pte) & _PAGE_EXEC; }
> -static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
>   
>   #ifdef CONFIG_NUMA_BALANCING
>   /*
> diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h b/arch/powerpc/include/asm/nohash/pte-book3e.h
> index 58eef8cb569d..f95ab6eaf441 100644
> --- a/arch/powerpc/include/asm/nohash/pte-book3e.h
> +++ b/arch/powerpc/include/asm/nohash/pte-book3e.h
> @@ -82,11 +82,6 @@
>   #define _PTE_NONE_MASK	0
>   #endif
>   
> -/* Mask of bits returned by pte_pgprot() */
> -#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
> -			 _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \
> -			 _PAGE_PRIVILEGED | _PAGE_RW | _PAGE_DIRTY | _PAGE_EXEC)
> -
>   /*
>    * We define 2 sets of base prot bits, one for basic pages (ie,
>    * cacheable kernel and user pages) and one for non cacheable
> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> index fb4b85bba110..9679b7519a35 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable.h
> @@ -46,6 +46,16 @@ struct mm_struct;
>   /* Keep these as a macros to avoid include dependency mess */
>   #define pte_page(x)		pfn_to_page(pte_pfn(x))
>   #define mk_pte(page, pgprot)	pfn_pte(page_to_pfn(page), (pgprot))
> +/*
> + * Select all bits except the pfn
> + */
> +static inline pgprot_t pte_pgprot(pte_t pte)
> +{
> +	unsigned long pte_flags;
> +
> +	pte_flags = pte_val(pte) & ~PTE_RPN_MASK;
> +	return __pgprot(pte_flags);
> +}
>   
>   /*
>    * ZERO_PAGE is a global shared page that is always zero: used
>
Michael Ellerman Oct. 17, 2018, 10:32 a.m. | #5
Christophe Leroy <christophe.leroy@c-s.fr> writes:
> On 10/17/2018 12:59 AM, Michael Ellerman wrote:
...
>> The question is what's the right way to fix it? Should pte_pgprot() not
>> be filtering those bits out on book3e?
>
> I think we should not use pte_pggrot() for that then. What about the 
> below fix ?

Thanks, that almost works.

pte_mkprivileged() also needs to not strip _PAGE_BAP_SR.


But there's also a use of pte_pgprot() in mm/memory.c, and I think that
is also broken now that we don't add PAGE_KERNEL back in.

Aneesh is going to do a patch to make pte_pgprot() only mask the PFN
which is what other arches do.

cheers

> From: Christophe Leroy <christophe.leroy@c-s.fr>
> Date: Wed, 17 Oct 2018 05:56:25 +0000
> Subject: [PATCH] powerpc/mm: don't use pte_pgprot() in ioremap_prot()
>
> pte_pgprot() filters out some required flags like _PAGE_PRESENT.
>
> This patch replaces pte_pgprot() by __pgprot(pte_val())
> in ioremap_prot()
>
> Fixes: 26973fa5ac0e ("powerpc/mm: use pte helpers in generic code")
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
>   arch/powerpc/mm/pgtable_32.c | 3 ++-
>   arch/powerpc/mm/pgtable_64.c | 4 ++--
>   2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
> index 5877f5aa8f5d..a606e2f4937b 100644
> --- a/arch/powerpc/mm/pgtable_32.c
> +++ b/arch/powerpc/mm/pgtable_32.c
> @@ -122,7 +122,8 @@ ioremap_prot(phys_addr_t addr, unsigned long size, 
> unsigned long flags)
>   	pte = pte_exprotect(pte);
>   	pte = pte_mkprivileged(pte);
>
> -	return __ioremap_caller(addr, size, pte_pgprot(pte), 
> __builtin_return_address(0));
> +	return __ioremap_caller(addr, size, __pgprot(pte_val(pte)),
> +				__builtin_return_address(0));
>   }
>   EXPORT_SYMBOL(ioremap_prot);
>
> diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
> index fb1375c07e8c..836bf436cabb 100644
> --- a/arch/powerpc/mm/pgtable_64.c
> +++ b/arch/powerpc/mm/pgtable_64.c
> @@ -245,8 +245,8 @@ void __iomem * ioremap_prot(phys_addr_t addr, 
> unsigned long size,
>   	pte = pte_mkprivileged(pte);
>
>   	if (ppc_md.ioremap)
> -		return ppc_md.ioremap(addr, size, pte_pgprot(pte), caller);
> -	return __ioremap_caller(addr, size, pte_pgprot(pte), caller);
> +		return ppc_md.ioremap(addr, size, __pgprot(pte_val(pte)), caller);
> +	return __ioremap_caller(addr, size, __pgprot(pte_val(pte)), caller);
>   }
>
>
> -- 
> 2.13.3
Christophe Leroy Oct. 17, 2018, 11:12 a.m. | #6
On 10/17/2018 10:32 AM, Michael Ellerman wrote:
> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>> On 10/17/2018 12:59 AM, Michael Ellerman wrote:
> ...
>>> The question is what's the right way to fix it? Should pte_pgprot() not
>>> be filtering those bits out on book3e?
>>
>> I think we should not use pte_pggrot() for that then. What about the
>> below fix ?
> 
> Thanks, that almost works.
> 
> pte_mkprivileged() also needs to not strip _PAGE_BAP_SR.

Oops, I missed it allthough I knew it. Patch below.

From: Christophe Leroy <christophe.leroy@c-s.fr>
Date: Wed, 17 Oct 2018 10:46:24 +0000
Subject: [PATCH] powerpc/book3e: redefine pte_mkprivileged() and 
pte_mkuser()
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>, Paul Mackerras 
<paulus@samba.org>, Michael Ellerman <mpe@ellerman.id.au>
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org

Book3e defines both _PAGE_USER and _PAGE_PRIVILEGED, so the nohash
default pte_mkprivileged() and pte_mkuser() are not usable.

This patch redefines them for book3e.

Fixes: a0da4bc166f2 ("powerpc/mm: Allow platforms to redefine some helpers")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
  arch/powerpc/include/asm/nohash/pte-book3e.h | 14 ++++++++++++++
  1 file changed, 14 insertions(+)

diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h 
b/arch/powerpc/include/asm/nohash/pte-book3e.h
index 58eef8cb569d..fb4297dff3e2 100644
--- a/arch/powerpc/include/asm/nohash/pte-book3e.h
+++ b/arch/powerpc/include/asm/nohash/pte-book3e.h
@@ -109,5 +109,19 @@
  #define PAGE_READONLY	__pgprot(_PAGE_BASE | _PAGE_USER)
  #define PAGE_READONLY_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)

+static inline pte_t pte_mkprivileged(pte_t pte)
+{
+	return __pte((pte_val(pte) & ~_PAGE_USER) | _PAGE_PRIVILEGED);
+}
+
+#define pte_mkprivileged pte_mkprivileged
+
+static inline pte_t pte_mkuser(pte_t pte)
+{
+	return __pte((pte_val(pte) & ~_PAGE_PRIVILEGED) | _PAGE_USER);
+}
+
+#define pte_mkuser pte_mkuser
+
  #endif /* __KERNEL__ */
  #endif /*  _ASM_POWERPC_NOHASH_PTE_BOOK3E_H */
Aneesh Kumar K.V Oct. 17, 2018, 11:53 a.m. | #7
On 10/17/18 4:42 PM, Christophe Leroy wrote:
> 
> 
> On 10/17/2018 10:32 AM, Michael Ellerman wrote:
>> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>>> On 10/17/2018 12:59 AM, Michael Ellerman wrote:
>> ...
>>>> The question is what's the right way to fix it? Should pte_pgprot() not
>>>> be filtering those bits out on book3e?
>>>
>>> I think we should not use pte_pggrot() for that then. What about the
>>> below fix ?
>>
>> Thanks, that almost works.
>>
>> pte_mkprivileged() also needs to not strip _PAGE_BAP_SR.
> 
> Oops, I missed it allthough I knew it. Patch below.
> 
> From: Christophe Leroy <christophe.leroy@c-s.fr>
> Date: Wed, 17 Oct 2018 10:46:24 +0000
> Subject: [PATCH] powerpc/book3e: redefine pte_mkprivileged() and 
> pte_mkuser()
> To: Benjamin Herrenschmidt <benh@kernel.crashing.org>, Paul Mackerras 
> <paulus@samba.org>, Michael Ellerman <mpe@ellerman.id.au>
> Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
> 
> Book3e defines both _PAGE_USER and _PAGE_PRIVILEGED, so the nohash
> default pte_mkprivileged() and pte_mkuser() are not usable.
> 
> This patch redefines them for book3e.
> 
> Fixes: a0da4bc166f2 ("powerpc/mm: Allow platforms to redefine some 
> helpers")
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
>   arch/powerpc/include/asm/nohash/pte-book3e.h | 14 ++++++++++++++
>   1 file changed, 14 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h 
> b/arch/powerpc/include/asm/nohash/pte-book3e.h
> index 58eef8cb569d..fb4297dff3e2 100644
> --- a/arch/powerpc/include/asm/nohash/pte-book3e.h
> +++ b/arch/powerpc/include/asm/nohash/pte-book3e.h
> @@ -109,5 +109,19 @@
>   #define PAGE_READONLY    __pgprot(_PAGE_BASE | _PAGE_USER)
>   #define PAGE_READONLY_X    __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
> 
> +static inline pte_t pte_mkprivileged(pte_t pte)
> +{
> +    return __pte((pte_val(pte) & ~_PAGE_USER) | _PAGE_PRIVILEGED);
> +}
> +
> +#define pte_mkprivileged pte_mkprivileged
> +
> +static inline pte_t pte_mkuser(pte_t pte)
> +{
> +    return __pte((pte_val(pte) & ~_PAGE_PRIVILEGED) | _PAGE_USER);
> +}
> +
> +#define pte_mkuser pte_mkuser
> +

I was build testing a similar patch. We would need to put #ifndef 
__ASSEMBLY__ around it.



-aneesh

Patch

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index a6ca799e0eb5..a0dc3a3eef33 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -331,17 +331,14 @@  static inline bool pte_ci(pte_t pte)
 #define pte_access_permitted pte_access_permitted
 static inline bool pte_access_permitted(pte_t pte, bool write)
 {
-	unsigned long pteval = pte_val(pte);
 	/*
 	 * A read-only access is controlled by _PAGE_USER bit.
 	 * We have _PAGE_READ set for WRITE and EXECUTE
 	 */
-	unsigned long need_pte_bits = _PAGE_PRESENT | _PAGE_USER;
-
-	if (write)
-		need_pte_bits |= _PAGE_WRITE;
+	if (!pte_present(pte) || !pte_user(pte) || !pte_read(pte))
+		return false;
 
-	if ((pteval & need_pte_bits) != need_pte_bits)
+	if (write && !pte_write(pte))
 		return false;
 
 	return true;
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 6fecfd7854f5..a4156da4a7a4 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -277,7 +277,10 @@  static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 				      pte_t *ptep)
 {
-	pte_update(ptep, (_PAGE_RW | _PAGE_HWWRITE), _PAGE_RO);
+	unsigned long clr = ~pte_val(pte_wrprotect(__pte(~0)));
+	unsigned long set = pte_val(pte_wrprotect(__pte(0)));
+
+	pte_update(ptep, clr, set);
 }
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
@@ -291,9 +294,10 @@  static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
 					   unsigned long address,
 					   int psize)
 {
-	unsigned long set = pte_val(entry) &
-		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
-	unsigned long clr = ~pte_val(entry) & (_PAGE_RO | _PAGE_NA);
+	pte_t pte_set = pte_mkyoung(pte_mkdirty(pte_mkwrite(pte_mkexec(__pte(0)))));
+	pte_t pte_clr = pte_mkyoung(pte_mkdirty(pte_mkwrite(pte_mkexec(__pte(~0)))));
+	unsigned long set = pte_val(entry) & pte_val(pte_set);
+	unsigned long clr = ~pte_val(entry) & ~pte_val(pte_clr);
 
 	pte_update(ptep, clr, set);
 
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index b256e38a047c..062d96233673 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -32,8 +32,7 @@  static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PA
  */
 static inline int pte_protnone(pte_t pte)
 {
-	return (pte_val(pte) &
-		(_PAGE_PRESENT | _PAGE_USER)) == _PAGE_PRESENT;
+	return pte_present(pte) && !pte_user(pte);
 }
 
 static inline int pmd_protnone(pmd_t pmd)
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index f97d9c3760e3..ca4b1f7ac39d 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -44,20 +44,13 @@  static inline int is_exec_fault(void)
 static inline int pte_looks_normal(pte_t pte)
 {
 
-#if defined(CONFIG_PPC_BOOK3S_64)
-	if ((pte_val(pte) & (_PAGE_PRESENT | _PAGE_SPECIAL)) == _PAGE_PRESENT) {
+	if (pte_present(pte) && !pte_special(pte)) {
 		if (pte_ci(pte))
 			return 0;
 		if (pte_user(pte))
 			return 1;
 	}
 	return 0;
-#else
-	return (pte_val(pte) &
-		(_PAGE_PRESENT | _PAGE_SPECIAL | _PAGE_NO_CACHE | _PAGE_USER |
-		 _PAGE_PRIVILEGED)) ==
-		(_PAGE_PRESENT | _PAGE_USER);
-#endif
 }
 
 static struct page *maybe_pte_to_page(pte_t pte)
@@ -117,7 +110,7 @@  static pte_t set_pte_filter(pte_t pte)
 	struct page *pg;
 
 	/* No exec permission in the first place, move on */
-	if (!(pte_val(pte) & _PAGE_EXEC) || !pte_looks_normal(pte))
+	if (!pte_exec(pte) || !pte_looks_normal(pte))
 		return pte;
 
 	/* If you set _PAGE_EXEC on weird pages you're on your own */
@@ -137,7 +130,7 @@  static pte_t set_pte_filter(pte_t pte)
 	}
 
 	/* Else, we filter out _PAGE_EXEC */
-	return __pte(pte_val(pte) & ~_PAGE_EXEC);
+	return pte_exprotect(pte);
 }
 
 static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
@@ -150,7 +143,7 @@  static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
 	 * if necessary. Also if _PAGE_EXEC is already set, same deal,
 	 * we just bail out
 	 */
-	if (dirty || (pte_val(pte) & _PAGE_EXEC) || !is_exec_fault())
+	if (dirty || pte_exec(pte) || !is_exec_fault())
 		return pte;
 
 #ifdef CONFIG_DEBUG_VM
@@ -176,7 +169,7 @@  static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
 	set_bit(PG_arch_1, &pg->flags);
 
  bail:
-	return __pte(pte_val(pte) | _PAGE_EXEC);
+	return pte_mkexec(pte);
 }
 
 #endif /* CONFIG_PPC_BOOK3S */
@@ -191,10 +184,10 @@  void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 	 * Make sure hardware valid bit is not set. We don't do
 	 * tlb flush for this update.
 	 */
-	VM_WARN_ON(pte_val(*ptep) & _PAGE_PRESENT);
+	VM_WARN_ON(pte_hw_valid(*ptep));
 
 	/* Add the pte bit when trying to set a pte */
-	pte = __pte(pte_val(pte) | _PAGE_PTE);
+	pte = pte_mkpte(pte);
 
 	/* Note: mm->context.id might not yet have been assigned as
 	 * this context might not have been activated yet when this
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 01f348938328..5877f5aa8f5d 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -112,15 +112,17 @@  EXPORT_SYMBOL(ioremap_coherent);
 void __iomem *
 ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long flags)
 {
+	pte_t pte = __pte(flags);
+
 	/* writeable implies dirty for kernel addresses */
-	if ((flags & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO)
-		flags |= _PAGE_DIRTY | _PAGE_HWWRITE;
+	if (pte_write(pte))
+		pte = pte_mkdirty(pte);
 
 	/* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
-	flags &= ~(_PAGE_USER | _PAGE_EXEC);
-	flags |= _PAGE_PRIVILEGED;
+	pte = pte_exprotect(pte);
+	pte = pte_mkprivileged(pte);
 
-	return __ioremap_caller(addr, size, __pgprot(flags), __builtin_return_address(0));
+	return __ioremap_caller(addr, size, pte_pgprot(pte), __builtin_return_address(0));
 }
 EXPORT_SYMBOL(ioremap_prot);
 
@@ -235,8 +237,7 @@  int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot)
 		/* The PTE should never be already set nor present in the
 		 * hash table
 		 */
-		BUG_ON((pte_val(*pg) & (_PAGE_PRESENT | _PAGE_HASHPTE)) &&
-		       pgprot_val(prot));
+		BUG_ON((pte_present(*pg) | pte_hashpte(*pg)) && pgprot_val(prot));
 		set_pte_at(&init_mm, va, pg, pfn_pte(pa >> PAGE_SHIFT, prot));
 	}
 	smp_wmb();
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index b0f4a4b4f62b..fb1375c07e8c 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -230,23 +230,23 @@  void __iomem *ioremap_coherent(phys_addr_t addr, unsigned long size)
 void __iomem * ioremap_prot(phys_addr_t addr, unsigned long size,
 			     unsigned long flags)
 {
+	pte_t pte = __pte(flags);
 	void *caller = __builtin_return_address(0);
 
 	/* writeable implies dirty for kernel addresses */
-	if (flags & _PAGE_WRITE)
-		flags |= _PAGE_DIRTY;
+	if (pte_write(pte))
+		pte = pte_mkdirty(pte);
 
 	/* we don't want to let _PAGE_EXEC leak out */
-	flags &= ~_PAGE_EXEC;
+	pte = pte_exprotect(pte);
 	/*
 	 * Force kernel mapping.
 	 */
-	flags &= ~_PAGE_USER;
-	flags |= _PAGE_PRIVILEGED;
+	pte = pte_mkprivileged(pte);
 
 	if (ppc_md.ioremap)
-		return ppc_md.ioremap(addr, size, __pgprot(flags), caller);
-	return __ioremap_caller(addr, size, __pgprot(flags), caller);
+		return ppc_md.ioremap(addr, size, pte_pgprot(pte), caller);
+	return __ioremap_caller(addr, size, pte_pgprot(pte), caller);
 }
 
 
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index c70d17c9a6ba..167271c7a97c 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2993,15 +2993,17 @@  static void show_task(struct task_struct *tsk)
 #ifdef CONFIG_PPC_BOOK3S_64
 void format_pte(void *ptep, unsigned long pte)
 {
+	pte_t entry = __pte(pte);
+
 	printf("ptep @ 0x%016lx = 0x%016lx\n", (unsigned long)ptep, pte);
 	printf("Maps physical address = 0x%016lx\n", pte & PTE_RPN_MASK);
 
 	printf("Flags = %s%s%s%s%s\n",
-	       (pte & _PAGE_ACCESSED) ? "Accessed " : "",
-	       (pte & _PAGE_DIRTY)    ? "Dirty " : "",
-	       (pte & _PAGE_READ)     ? "Read " : "",
-	       (pte & _PAGE_WRITE)    ? "Write " : "",
-	       (pte & _PAGE_EXEC)     ? "Exec " : "");
+	       pte_young(entry) ? "Accessed " : "",
+	       pte_dirty(entry) ? "Dirty " : "",
+	       pte_read(entry)  ? "Read " : "",
+	       pte_write(entry) ? "Write " : "",
+	       pte_exec(entry)  ? "Exec " : "");
 }
 
 static void show_pte(unsigned long addr)