diff mbox series

[v9,3/8] powerpc/kprobes: Mark newly allocated probes as RO

Message ID 20210316031741.1004850-3-jniethe5@gmail.com (mailing list archive)
State Changes Requested
Headers show
Series [v9,1/8] powerpc/mm: Implement set_memory() routines | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch warning Failed to apply on branch powerpc/merge (0512161accb8b6f6dacc85d165350b1812ddcc33)
snowpatch_ozlabs/apply_patch warning Failed to apply on branch powerpc/next (fbda7904302499dd7ffc073a3c84eb7c9275db0a)
snowpatch_ozlabs/apply_patch warning Failed to apply on branch linus/master (1a4431a5db2bf800c647ee0ed87f2727b8d6c29c)
snowpatch_ozlabs/apply_patch warning Failed to apply on branch powerpc/fixes (eed5fae00593ab9d261a0c1ffc1bdb786a87a55a)
snowpatch_ozlabs/apply_patch warning Failed to apply on branch linux-next (1e28eed17697bcf343c6743f0028cc3b5dd88bf0)
snowpatch_ozlabs/apply_patch fail Failed to apply to any branch

Commit Message

Jordan Niethe March 16, 2021, 3:17 a.m. UTC
From: Russell Currey <ruscur@russell.cc>

With CONFIG_STRICT_KERNEL_RWX=y and CONFIG_KPROBES=y, there will be one
W+X page at boot by default.  This can be tested with
CONFIG_PPC_PTDUMP=y and CONFIG_PPC_DEBUG_WX=y set, and checking the
kernel log during boot.

Add an arch specific insn page allocator which returns RO pages if
STRICT_KERNEL_RWX is enabled. This page is only written to with
patch_instruction() which is able to write RO pages.

Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[jpn: Reword commit message, switch from vmalloc_exec(), add
      free_insn_page()]
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v9: - vmalloc_exec() no longer exists
    - Set the page to RW before freeing it
---
 arch/powerpc/kernel/kprobes.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

Comments

Christophe Leroy March 16, 2021, 6:44 a.m. UTC | #1
Le 16/03/2021 à 04:17, Jordan Niethe a écrit :
> From: Russell Currey <ruscur@russell.cc>
> 
> With CONFIG_STRICT_KERNEL_RWX=y and CONFIG_KPROBES=y, there will be one
> W+X page at boot by default.  This can be tested with
> CONFIG_PPC_PTDUMP=y and CONFIG_PPC_DEBUG_WX=y set, and checking the
> kernel log during boot.
> 

This text is confusing. I don't understand what is the status before the patch, and what is the 
status after.

"there will be one ...", does it mean after the patch ?

> Add an arch specific insn page allocator which returns RO pages if
> STRICT_KERNEL_RWX is enabled. This page is only written to with
> patch_instruction() which is able to write RO pages.

"an" or "the" arch specific insn page allocator ?

> 
> Reviewed-by: Daniel Axtens <dja@axtens.net>
> Signed-off-by: Russell Currey <ruscur@russell.cc>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> [jpn: Reword commit message, switch from vmalloc_exec(), add
>        free_insn_page()]
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
> v9: - vmalloc_exec() no longer exists
>      - Set the page to RW before freeing it
> ---
>   arch/powerpc/kernel/kprobes.c | 22 ++++++++++++++++++++++
>   1 file changed, 22 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> index 01ab2163659e..bb7e4d321988 100644
> --- a/arch/powerpc/kernel/kprobes.c
> +++ b/arch/powerpc/kernel/kprobes.c
> @@ -25,6 +25,8 @@
>   #include <asm/sections.h>
>   #include <asm/inst.h>
>   #include <linux/uaccess.h>
> +#include <linux/set_memory.h>
> +#include <linux/vmalloc.h>
>   
>   DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
>   DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
> @@ -103,6 +105,26 @@ kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
>   	return addr;
>   }
>   
> +void *alloc_insn_page(void)
> +{
> +	void *page = vmalloc(PAGE_SIZE);

Can't do that on book3s/32, see https://github.com/linuxppc/linux/commit/6ca05532 and 
https://github.com/linuxppc/linux/commit/7fbc22ce

Should do:
	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END, GFP_KERNEL,
				    PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
				    __builtin_return_address(0));


To keep it simple, you'll probably need to define MODULES_VADDR and MODULES_END as resp 
VMALLOC_START and VMALLOC_END when they are not defined, maybe in asm/pgtable.h

> +
> +	if (!page)
> +		return NULL;
> +
> +	set_memory_ro((unsigned long)page, 1);
> +	set_memory_x((unsigned long)page, 1);
> +
> +	return page;
> +}
> +
> +void free_insn_page(void *page)
> +{
> +	set_memory_nx((unsigned long)page, 1);
> +	set_memory_rw((unsigned long)page, 1);
> +	vfree(page);
> +}
> +
>   int arch_prepare_kprobe(struct kprobe *p)
>   {
>   	int ret = 0;
>
Jordan Niethe March 17, 2021, 12:50 a.m. UTC | #2
On Tue, Mar 16, 2021 at 5:44 PM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 16/03/2021 à 04:17, Jordan Niethe a écrit :
> > From: Russell Currey <ruscur@russell.cc>
> >
> > With CONFIG_STRICT_KERNEL_RWX=y and CONFIG_KPROBES=y, there will be one
> > W+X page at boot by default.  This can be tested with
> > CONFIG_PPC_PTDUMP=y and CONFIG_PPC_DEBUG_WX=y set, and checking the
> > kernel log during boot.
> >
>
> This text is confusing. I don't understand what is the status before the patch, and what is the
> status after.
Before the patch kprobes is allocating W+X pages. This can be seen in
the kernel log with those debug options on.
After the patch kprobes  no longer allocate W+X pages.
I will reword it to more clear.
>
> "there will be one ...", does it mean after the patch ?
No, before, after there will be none.
>
> > Add an arch specific insn page allocator which returns RO pages if
> > STRICT_KERNEL_RWX is enabled. This page is only written to with
> > patch_instruction() which is able to write RO pages.
>
> "an" or "the" arch specific insn page allocator ?
Hmm, will go with "the arch specific insn page allocator for powerpc".
>
> >
> > Reviewed-by: Daniel Axtens <dja@axtens.net>
> > Signed-off-by: Russell Currey <ruscur@russell.cc>
> > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > [jpn: Reword commit message, switch from vmalloc_exec(), add
> >        free_insn_page()]
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> > v9: - vmalloc_exec() no longer exists
> >      - Set the page to RW before freeing it
> > ---
> >   arch/powerpc/kernel/kprobes.c | 22 ++++++++++++++++++++++
> >   1 file changed, 22 insertions(+)
> >
> > diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> > index 01ab2163659e..bb7e4d321988 100644
> > --- a/arch/powerpc/kernel/kprobes.c
> > +++ b/arch/powerpc/kernel/kprobes.c
> > @@ -25,6 +25,8 @@
> >   #include <asm/sections.h>
> >   #include <asm/inst.h>
> >   #include <linux/uaccess.h>
> > +#include <linux/set_memory.h>
> > +#include <linux/vmalloc.h>
> >
> >   DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
> >   DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
> > @@ -103,6 +105,26 @@ kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
> >       return addr;
> >   }
> >
> > +void *alloc_insn_page(void)
> > +{
> > +     void *page = vmalloc(PAGE_SIZE);
>
> Can't do that on book3s/32, see https://github.com/linuxppc/linux/commit/6ca05532 and
> https://github.com/linuxppc/linux/commit/7fbc22ce
>
> Should do:
>         return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END, GFP_KERNEL,
>                                     PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
>                                     __builtin_return_address(0));
>
>
> To keep it simple, you'll probably need to define MODULES_VADDR and MODULES_END as resp
> VMALLOC_START and VMALLOC_END when they are not defined, maybe in asm/pgtable.h
>
> > +
> > +     if (!page)
> > +             return NULL;
> > +
> > +     set_memory_ro((unsigned long)page, 1);
> > +     set_memory_x((unsigned long)page, 1);
> > +
> > +     return page;
> > +}
> > +
> > +void free_insn_page(void *page)
> > +{
> > +     set_memory_nx((unsigned long)page, 1);
> > +     set_memory_rw((unsigned long)page, 1);
> > +     vfree(page);
> > +}
> > +
> >   int arch_prepare_kprobe(struct kprobe *p)
> >   {
> >       int ret = 0;
> >
Jordan Niethe March 17, 2021, 12:52 a.m. UTC | #3
On Tue, Mar 16, 2021 at 5:44 PM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 16/03/2021 à 04:17, Jordan Niethe a écrit :
> > From: Russell Currey <ruscur@russell.cc>
> >
> > With CONFIG_STRICT_KERNEL_RWX=y and CONFIG_KPROBES=y, there will be one
> > W+X page at boot by default.  This can be tested with
> > CONFIG_PPC_PTDUMP=y and CONFIG_PPC_DEBUG_WX=y set, and checking the
> > kernel log during boot.
> >
>
> This text is confusing. I don't understand what is the status before the patch, and what is the
> status after.
>
> "there will be one ...", does it mean after the patch ?
>
> > Add an arch specific insn page allocator which returns RO pages if
> > STRICT_KERNEL_RWX is enabled. This page is only written to with
> > patch_instruction() which is able to write RO pages.
>
> "an" or "the" arch specific insn page allocator ?
>
> >
> > Reviewed-by: Daniel Axtens <dja@axtens.net>
> > Signed-off-by: Russell Currey <ruscur@russell.cc>
> > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > [jpn: Reword commit message, switch from vmalloc_exec(), add
> >        free_insn_page()]
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> > v9: - vmalloc_exec() no longer exists
> >      - Set the page to RW before freeing it
> > ---
> >   arch/powerpc/kernel/kprobes.c | 22 ++++++++++++++++++++++
> >   1 file changed, 22 insertions(+)
> >
> > diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> > index 01ab2163659e..bb7e4d321988 100644
> > --- a/arch/powerpc/kernel/kprobes.c
> > +++ b/arch/powerpc/kernel/kprobes.c
> > @@ -25,6 +25,8 @@
> >   #include <asm/sections.h>
> >   #include <asm/inst.h>
> >   #include <linux/uaccess.h>
> > +#include <linux/set_memory.h>
> > +#include <linux/vmalloc.h>
> >
> >   DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
> >   DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
> > @@ -103,6 +105,26 @@ kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
> >       return addr;
> >   }
> >
> > +void *alloc_insn_page(void)
> > +{
> > +     void *page = vmalloc(PAGE_SIZE);
>
> Can't do that on book3s/32, see https://github.com/linuxppc/linux/commit/6ca05532 and
> https://github.com/linuxppc/linux/commit/7fbc22ce
>
> Should do:
>         return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END, GFP_KERNEL,
>                                     PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
>                                     __builtin_return_address(0));
>
>
> To keep it simple, you'll probably need to define MODULES_VADDR and MODULES_END as resp
> VMALLOC_START and VMALLOC_END when they are not defined, maybe in asm/pgtable.h
Thank you, I had overlooked that. I will do it like that in the next revision.
>
> > +
> > +     if (!page)
> > +             return NULL;
> > +
> > +     set_memory_ro((unsigned long)page, 1);
> > +     set_memory_x((unsigned long)page, 1);
> > +
> > +     return page;
> > +}
> > +
> > +void free_insn_page(void *page)
> > +{
> > +     set_memory_nx((unsigned long)page, 1);
> > +     set_memory_rw((unsigned long)page, 1);
> > +     vfree(page);
> > +}
> > +
> >   int arch_prepare_kprobe(struct kprobe *p)
> >   {
> >       int ret = 0;
> >
Christophe Leroy March 17, 2021, 6:12 a.m. UTC | #4
Le 16/03/2021 à 04:17, Jordan Niethe a écrit :
> From: Russell Currey <ruscur@russell.cc>
> 
> With CONFIG_STRICT_KERNEL_RWX=y and CONFIG_KPROBES=y, there will be one
> W+X page at boot by default.  This can be tested with
> CONFIG_PPC_PTDUMP=y and CONFIG_PPC_DEBUG_WX=y set, and checking the
> kernel log during boot.
> 
> Add an arch specific insn page allocator which returns RO pages if
> STRICT_KERNEL_RWX is enabled. This page is only written to with
> patch_instruction() which is able to write RO pages.
> 

Did you investigate BPF ? The problematic looks more or less similar to kprobe:

bpf_jit_compile() in arch/powerpc/net/bpf_jit_comp.c calls module_alloc(), which provides it with 
PAGE_KERNEL_TEXT memory, ie RWX. That function is only used on PPC32 which still has Classic BPF, 
and this is about to go away with future series 
https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1608112796.git.christophe.leroy@csgroup.eu/

PPC64 has Extended BPF instead, and PPC32 will it the future too.
bpf_int_jit_compile() in arch/powerpc/net/bpf_jit_comp64.c calls bpf_jit_binary_alloc() which uses 
bpf_jit_alloc_exec().

bpf_jit_alloc_exec() is a weak function that should be redefined for powerpc I think, more or less 
like alloc_insn_page() for kprobes.

Christophe
Jordan Niethe March 18, 2021, 2:42 a.m. UTC | #5
On Wed, Mar 17, 2021 at 5:12 PM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 16/03/2021 à 04:17, Jordan Niethe a écrit :
> > From: Russell Currey <ruscur@russell.cc>
> >
> > With CONFIG_STRICT_KERNEL_RWX=y and CONFIG_KPROBES=y, there will be one
> > W+X page at boot by default.  This can be tested with
> > CONFIG_PPC_PTDUMP=y and CONFIG_PPC_DEBUG_WX=y set, and checking the
> > kernel log during boot.
> >
> > Add an arch specific insn page allocator which returns RO pages if
> > STRICT_KERNEL_RWX is enabled. This page is only written to with
> > patch_instruction() which is able to write RO pages.
> >
>
> Did you investigate BPF ? The problematic looks more or less similar to kprobe:
>
> bpf_jit_compile() in arch/powerpc/net/bpf_jit_comp.c calls module_alloc(), which provides it with
> PAGE_KERNEL_TEXT memory, ie RWX. That function is only used on PPC32 which still has Classic BPF,
> and this is about to go away with future series
> https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1608112796.git.christophe.leroy@csgroup.eu/
>
> PPC64 has Extended BPF instead, and PPC32 will it the future too.
> bpf_int_jit_compile() in arch/powerpc/net/bpf_jit_comp64.c calls bpf_jit_binary_alloc() which uses
> bpf_jit_alloc_exec().
>
> bpf_jit_alloc_exec() is a weak function that should be redefined for powerpc I think, more or less
> like alloc_insn_page() for kprobes.
Thanks, that is a good point. I will handle bpf with the next revision.
>
> Christophe
diff mbox series

Patch

diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 01ab2163659e..bb7e4d321988 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -25,6 +25,8 @@ 
 #include <asm/sections.h>
 #include <asm/inst.h>
 #include <linux/uaccess.h>
+#include <linux/set_memory.h>
+#include <linux/vmalloc.h>
 
 DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
 DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
@@ -103,6 +105,26 @@  kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
 	return addr;
 }
 
+void *alloc_insn_page(void)
+{
+	void *page = vmalloc(PAGE_SIZE);
+
+	if (!page)
+		return NULL;
+
+	set_memory_ro((unsigned long)page, 1);
+	set_memory_x((unsigned long)page, 1);
+
+	return page;
+}
+
+void free_insn_page(void *page)
+{
+	set_memory_nx((unsigned long)page, 1);
+	set_memory_rw((unsigned long)page, 1);
+	vfree(page);
+}
+
 int arch_prepare_kprobe(struct kprobe *p)
 {
 	int ret = 0;