diff mbox series

[v4,1/2] powerpc/mm: Support execute-only memory on the Radix MMU

Message ID 20220817050640.406017-1-ruscur@russell.cc (mailing list archive)
State Accepted
Headers show
Series [v4,1/2] powerpc/mm: Support execute-only memory on the Radix MMU | expand

Commit Message

Russell Currey Aug. 17, 2022, 5:06 a.m. UTC
Add support for execute-only memory (XOM) for the Radix MMU by using an
execute-only mapping, as opposed to the RX mapping used by powerpc's
other MMUs.

The Hash MMU already supports XOM through the execute-only pkey,
which is a separate mechanism shared with x86.  A PROT_EXEC-only mapping
will map to RX, and then the pkey will be applied on top of it.

mmap() and mprotect() consumers in userspace should observe the same
behaviour on Hash and Radix despite the differences in implementation.

Replacing the vma_is_accessible() check in access_error() with a read
check should be functionally equivalent for non-Radix MMUs, since it
follows write and execute checks.  For Radix, the change enables
detecting faults on execute-only mappings where vma_is_accessible() would
return true.

Signed-off-by: Russell Currey <ruscur@russell.cc>
---
v4: Reword commit message, add changes suggested by Christophe and Aneesh

 arch/powerpc/include/asm/book3s/64/pgtable.h |  2 ++
 arch/powerpc/mm/book3s64/pgtable.c           | 11 +++++++++--
 arch/powerpc/mm/fault.c                      |  6 +++++-
 3 files changed, 16 insertions(+), 3 deletions(-)

Comments

Michael Ellerman Aug. 31, 2022, 1:13 p.m. UTC | #1
On Wed, 17 Aug 2022 15:06:39 +1000, Russell Currey wrote:
> Add support for execute-only memory (XOM) for the Radix MMU by using an
> execute-only mapping, as opposed to the RX mapping used by powerpc's
> other MMUs.
> 
> The Hash MMU already supports XOM through the execute-only pkey,
> which is a separate mechanism shared with x86.  A PROT_EXEC-only mapping
> will map to RX, and then the pkey will be applied on top of it.
> 
> [...]

Applied to powerpc/next.

[1/2] powerpc/mm: Support execute-only memory on the Radix MMU
      https://git.kernel.org/powerpc/c/395cac7752b905318ae454a8b859d4c190485510
[2/2] selftests/powerpc: Add a test for execute-only memory
      https://git.kernel.org/powerpc/c/98acee3f8db451eaab9fbd422e523c228aacf08c

cheers
Michal Suchánek March 8, 2023, 3:27 p.m. UTC | #2
Hello,

On Wed, Aug 31, 2022 at 11:13:59PM +1000, Michael Ellerman wrote:
> On Wed, 17 Aug 2022 15:06:39 +1000, Russell Currey wrote:
> > Add support for execute-only memory (XOM) for the Radix MMU by using an
> > execute-only mapping, as opposed to the RX mapping used by powerpc's
> > other MMUs.
> > 
> > The Hash MMU already supports XOM through the execute-only pkey,
> > which is a separate mechanism shared with x86.  A PROT_EXEC-only mapping
> > will map to RX, and then the pkey will be applied on top of it.
> > 
> > [...]
> 
> Applied to powerpc/next.
> 
> [1/2] powerpc/mm: Support execute-only memory on the Radix MMU
>       https://git.kernel.org/powerpc/c/395cac7752b905318ae454a8b859d4c190485510

This breaks libaio tests (on POWER9 hash PowerVM):
https://pagure.io/libaio/blob/master/f/harness/cases/5.t#_43

cases/5.p
expect   512: (w), res =   512 [Success]
expect   512: (r), res =   512 [Success]
expect   512: (r), res =   512 [Success]
expect   512: (w), res =   512 [Success]
expect   512: (w), res =   512 [Success]
expect   -14: (r), res =   -14 [Bad address]
expect   512: (r), res =   512 [Success]
expect   512: (w), res =   512 [Success]
test cases/5.t completed PASSED.

cases/5.p
expect   512: (w), res =   512 [Success]
expect   512: (r), res =   512 [Success]
expect   512: (r), res =   512 [Success]
expect   512: (w), res =   512 [Success]
expect   512: (w), res =   512 [Success]
expect   -14: (r), res =   -14 [Bad address]
expect   512: (r), res =   512 [Success]
expect   -14: (w), res =   512 [Success] -- FAILED
test cases/5.t completed FAILED.

Can you have a look if that test assumption is OK?

Thanks

Michal
Russell Currey March 9, 2023, 12:05 a.m. UTC | #3
On Wed, 2023-03-08 at 16:27 +0100, Michal Suchánek wrote:
> Hello,
> 
> On Wed, Aug 31, 2022 at 11:13:59PM +1000, Michael Ellerman wrote:
> > On Wed, 17 Aug 2022 15:06:39 +1000, Russell Currey wrote:
> > > Add support for execute-only memory (XOM) for the Radix MMU by
> > > using an
> > > execute-only mapping, as opposed to the RX mapping used by
> > > powerpc's
> > > other MMUs.
> > > 
> > > The Hash MMU already supports XOM through the execute-only pkey,
> > > which is a separate mechanism shared with x86.  A PROT_EXEC-only
> > > mapping
> > > will map to RX, and then the pkey will be applied on top of it.
> > > 
> > > [...]
> > 
> > Applied to powerpc/next.
> > 
> > [1/2] powerpc/mm: Support execute-only memory on the Radix MMU
> >      
> > https://git.kernel.org/powerpc/c/395cac7752b905318ae454a8b859d4c190485510
> 
> This breaks libaio tests (on POWER9 hash PowerVM):
> https://pagure.io/libaio/blob/master/f/harness/cases/5.t#_43
> 
> cases/5.p
> expect   512: (w), res =   512 [Success]
> expect   512: (r), res =   512 [Success]
> expect   512: (r), res =   512 [Success]
> expect   512: (w), res =   512 [Success]
> expect   512: (w), res =   512 [Success]
> expect   -14: (r), res =   -14 [Bad address]
> expect   512: (r), res =   512 [Success]
> expect   512: (w), res =   512 [Success]
> test cases/5.t completed PASSED.
> 
> cases/5.p
> expect   512: (w), res =   512 [Success]
> expect   512: (r), res =   512 [Success]
> expect   512: (r), res =   512 [Success]
> expect   512: (w), res =   512 [Success]
> expect   512: (w), res =   512 [Success]
> expect   -14: (r), res =   -14 [Bad address]
> expect   512: (r), res =   512 [Success]
> expect   -14: (w), res =   512 [Success] -- FAILED
> test cases/5.t completed FAILED.
> 
> Can you have a look if that test assumption is OK?

Hi Michal, thanks for the report.

This wasn't an intended behaviour change, so it is a bug.  I have no
idea why we hit the fault in write() but not in io_submit(), though. 
The same issue applies under Radix.

What's happening here is that we're taking a page fault and calling
into access_error() and returning true when we shouldn't.  Previously
we didn't check for read faults and only checked for PROT_NONE.  My
patch checks the vma flags to see if they lack VM_READ after we check
for exec and write, which ignores that VM_WRITE implies read 

This means we're mishandling faults for write-only mappings by assuming
that the lack of VM_READ means we're faulting from read, when that
should only be possible under a PROT_EXEC-only mapping.

I think the correct behaviour is

	if (unlikely(!(vma->vm_flags & (VM_READ | VM_WRITE))))

in access_error().

Will do some more testing and send a patch soon.  I also need to verify
that write implying read is true for all powerpc platforms.

- Russell

> 
> Thanks
> 
> Michal
diff mbox series

Patch

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 392ff48f77df..486902aff040 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -151,6 +151,8 @@ 
 #define PAGE_COPY_X	__pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC)
 #define PAGE_READONLY	__pgprot(_PAGE_BASE | _PAGE_READ)
 #define PAGE_READONLY_X	__pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC)
+/* Radix only, Hash uses PAGE_READONLY_X + execute-only pkey instead */
+#define PAGE_EXECONLY	__pgprot(_PAGE_BASE | _PAGE_EXEC)
 
 /* Permission masks used for kernel mappings */
 #define PAGE_KERNEL	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index 7b9966402b25..f6151a589298 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -553,8 +553,15 @@  EXPORT_SYMBOL_GPL(memremap_compat_align);
 
 pgprot_t vm_get_page_prot(unsigned long vm_flags)
 {
-	unsigned long prot = pgprot_val(protection_map[vm_flags &
-					(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);
+	unsigned long prot;
+
+	/* Radix supports execute-only, but protection_map maps X -> RX */
+	if (radix_enabled() && ((vm_flags & VM_ACCESS_FLAGS) == VM_EXEC)) {
+		prot = pgprot_val(PAGE_EXECONLY);
+	} else {
+		prot = pgprot_val(protection_map[vm_flags &
+						 (VM_ACCESS_FLAGS | VM_SHARED)]);
+	}
 
 	if (vm_flags & VM_SAO)
 		prot |= _PAGE_SAO;
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 014005428687..1566804e4b3d 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -270,7 +270,11 @@  static bool access_error(bool is_write, bool is_exec, struct vm_area_struct *vma
 		return false;
 	}
 
-	if (unlikely(!vma_is_accessible(vma)))
+	/*
+	 * Check for a read fault.  This could be caused by a read on an
+	 * inaccessible page (i.e. PROT_NONE), or a Radix MMU execute-only page.
+	 */
+	if (unlikely(!(vma->vm_flags & VM_READ)))
 		return true;
 	/*
 	 * We should ideally do the vma pkey access check here. But in the