diff mbox

[v2,1/2] uprobes: Pass probed vaddr to arch_uprobe_analyze_insn()

Message ID 20120608093257.GG13409@in.ibm.com (mailing list archive)
State Changes Requested, archived
Headers show

Commit Message

Ananth N Mavinakayanahalli June 8, 2012, 9:32 a.m. UTC
From: Ananth N Mavinakayanahalli <ananth@in.ibm.com>

On RISC architectures like powerpc, instructions are fixed size.
Instruction analysis on such platforms is just a matter of (insn % 4).
Pass the vaddr at which the uprobe is to be inserted so that
arch_uprobe_analyze_insn() can flag misaligned registration requests.

Changes in V2:
Pass (unsigned long)addr instead of (loff_t)vaddr to
arch_uprobe_analyze_insn(). We need the loff_t vaddr to take care of
the offset of inode:offset pair for large file sizes on 32bit.

Signed-off-by: Ananth N Mavinakaynahalli <ananth@in.ibm.com>
---
 arch/x86/include/asm/uprobes.h |    2 +-
 arch/x86/kernel/uprobes.c      |    3 ++-
 kernel/events/uprobes.c        |    2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

Comments

Oleg Nesterov June 11, 2012, 4:12 p.m. UTC | #1
Ananth, Srikar,

I think the patch is correct and I am sorry for nit-picking,
this is really minor.

But,

On 06/08, Ananth N Mavinakayanahalli wrote:
>
> Changes in V2:
> Pass (unsigned long)addr

Well, IMHO, this is confusing.

First of all, why do we have this "addr" or even "vaddr"? It should
not exists. We pass it to copy_insn(), but for what?? copy_insn()
should simply use uprobe->offset, the virtual address for this
particular mapping does not matter at all. I am going to send
the cleanup.

Note also that we should move this !UPROBE_COPY_INSN from
install_breakpoint() to somewhere near alloc_uprobe(). This code
is called only once, it looks a bit strange to use the "random" mm
(the first mm vma_prio_tree_foreach() finds) and its mapping to
verify the insn. In fact this is simply not correct and should be
fixed, note that on x86 arch_uprobe_analyze_insn() checks
mm->context.ia32_compat.

IOW, Perhaps uprobe->offset makes more sense?

> --- linux-3.5-rc1.orig/kernel/events/uprobes.c
> +++ linux-3.5-rc1/kernel/events/uprobes.c
> @@ -697,7 +697,7 @@ install_breakpoint(struct uprobe *uprobe
>  		if (is_swbp_insn((uprobe_opcode_t *)uprobe->arch.insn))
>  			return -EEXIST;
>
> -		ret = arch_uprobe_analyze_insn(&uprobe->arch, mm);
> +		ret = arch_uprobe_analyze_insn(&uprobe->arch, mm, addr);

Just fyi, this conflicts with
"[PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T"
I sent, but the conflict is trivial.

Oleg.
Oleg Nesterov June 11, 2012, 7:09 p.m. UTC | #2
On 06/11, Oleg Nesterov wrote:
>
> Note also that we should move this !UPROBE_COPY_INSN from
> install_breakpoint() to somewhere near alloc_uprobe().

The main problem is, uprobe_register() doesn't have struct file
for read_mapping_page().

Stupid question. I'm afraid the answer is "no" but I'll ask anyway.
Is it safe to pass filp == NULL to mapping->readpage()? In fact
I do not understand why it needs "struct file*" and I do not see
any example of actual usage. Yes, I didn't try to grep very much
and I understand that the filesystem can do something special.
Say it can use file->private_data...

However. There is read_cache_page_gfp() which does use
a_ops->readpage(filp => NULL), and the comment says nothing about
the riskiness.


If not, is there any other way
uprobe_register(struct inode *inode, loff_t offset) can read the
page at this offset?



And btw, read_mapping_page() accepts "void *data". Why? it uses
filler == a_ops->readpage, it shouldn't accept anything but file
pointer?


Please help, I know nothing about vfs.

Oleg.
Srikar Dronamraju June 12, 2012, 4:54 p.m. UTC | #3
> 
> Well, IMHO, this is confusing.
> 
> First of all, why do we have this "addr" or even "vaddr"? It should
> not exists. We pass it to copy_insn(), but for what?? copy_insn()
> should simply use uprobe->offset, the virtual address for this
> particular mapping does not matter at all. I am going to send
> the cleanup.
> 

Yes, we can use uprobe->offset instead of vaddr/addr.

> Note also that we should move this !UPROBE_COPY_INSN from
> install_breakpoint() to somewhere near alloc_uprobe(). This code
> is called only once, it looks a bit strange to use the "random" mm
> (the first mm vma_prio_tree_foreach() finds) and its mapping to
> verify the insn. In fact this is simply not correct and should be
> fixed, note that on x86 arch_uprobe_analyze_insn() checks

The reason we "delay" the copy_insn to the first insert is because 
we have to get access to mm. For archs like x86, we want to know if the
executable is 32 bit or not (since we have a different valid set of
instructions for 32 bit and 64 bit). So in effect, if we get access to
struct file corresponding to the inode and if the inode corresponds to
32 bit executable file or 64 bit executable file during register, then
we can move it around alloc_uprobe().
Oleg Nesterov June 12, 2012, 5:43 p.m. UTC | #4
On 06/12, Srikar Dronamraju wrote:
> >
> > Note also that we should move this !UPROBE_COPY_INSN from
> > install_breakpoint() to somewhere near alloc_uprobe(). This code
> > is called only once, it looks a bit strange to use the "random" mm
> > (the first mm vma_prio_tree_foreach() finds) and its mapping to
> > verify the insn. In fact this is simply not correct and should be
> > fixed, note that on x86 arch_uprobe_analyze_insn() checks
>
> The reason we "delay" the copy_insn to the first insert is because
> we have to get access to mm. For archs like x86, we want to know if the
> executable is 32 bit or not

Yes. And this is wrong afaics.

Once again. This !UPROBE_COPY_INSN code is called only once, and it
uses the "random" mm. After that install_breakpoint() just calls
set_swbp(another_mm) while the insn can be invalid because
another_mm->ia32_compat != mm->ia32_compat.

> So in effect, if we get access to
> struct file corresponding to the inode and if the inode corresponds to
> 32 bit executable file or 64 bit executable file during register, then
> we can move it around alloc_uprobe().

I don't think this can work. I have another simple fix in mind, I'll
write another email later.

Oleg.
Peter Zijlstra June 13, 2012, 9:58 a.m. UTC | #5
On Mon, 2012-06-11 at 21:09 +0200, Oleg Nesterov wrote:
> Stupid question. I'm afraid the answer is "no" but I'll ask anyway.
> Is it safe to pass filp == NULL to mapping->readpage()? In fact
> I do not understand why it needs "struct file*" and I do not see
> any example of actual usage. 

Looking at afs_readpage it looks like its OK to pass in NULL. Same for
nfs_readpage. They use the file, if provided, to avoid some lookups, but
seem to deal with not having it.

This answer by example is of course not authorative nor complete.
Oleg Nesterov June 13, 2012, 7:15 p.m. UTC | #6
On 06/12, Oleg Nesterov wrote:
>
> On 06/12, Srikar Dronamraju wrote:
> > >
> > > Note also that we should move this !UPROBE_COPY_INSN from
> > > install_breakpoint() to somewhere near alloc_uprobe(). This code
> > > is called only once, it looks a bit strange to use the "random" mm
> > > (the first mm vma_prio_tree_foreach() finds) and its mapping to
> > > verify the insn. In fact this is simply not correct and should be
> > > fixed, note that on x86 arch_uprobe_analyze_insn() checks
> >
> > The reason we "delay" the copy_insn to the first insert is because
> > we have to get access to mm. For archs like x86, we want to know if the
> > executable is 32 bit or not
>
> Yes. And this is wrong afaics.
>
> Once again. This !UPROBE_COPY_INSN code is called only once, and it
> uses the "random" mm. After that install_breakpoint() just calls
> set_swbp(another_mm) while the insn can be invalid because
> another_mm->ia32_compat != mm->ia32_compat.
>
> > So in effect, if we get access to
> > struct file corresponding to the inode and if the inode corresponds to
> > 32 bit executable file or 64 bit executable file during register, then
> > we can move it around alloc_uprobe().
>
> I don't think this can work. I have another simple fix in mind, I'll
> write another email later.

For example. Suppose there is some instruction in /lib64/libc.so which
is valid for 64-bit, but not for 32-bit.

Suppose that a 32-bit application does mmap("/lib64/libc.so", PROT_EXEC).

Now. If vma_prio_tree_foreach() finds this 32-bit mm first, uprobe_register()
fails even if there are other 64-bit applications which could be traced.

Or. uprobe_register() succeeds because it finds a 64-bit mm first, and
then that 32-bit application actually executes the invalid insn.

We can move arch_uprobe_analyze_insn() outside of !UPROBE_COPY_INSN block.

Or, perhaps, validate_insn_bits() should call both
validate_insn_32bits() and validate_insn_64bits(), and set the
UPROBE_VALID_IF_32 / UPROBE_VALID_IF_64 flags. install_breakpoint()
should do the additinal check before set_swbp() and verify that
.ia32_compat matches UPROBE_VALID_IF_*.

What do you think?

Oleg.
Oleg Nesterov June 13, 2012, 7:19 p.m. UTC | #7
On 06/13, Peter Zijlstra wrote:
>
> On Mon, 2012-06-11 at 21:09 +0200, Oleg Nesterov wrote:
> > Stupid question. I'm afraid the answer is "no" but I'll ask anyway.
> > Is it safe to pass filp == NULL to mapping->readpage()? In fact
> > I do not understand why it needs "struct file*" and I do not see
> > any example of actual usage.
>
> Looking at afs_readpage it looks like its OK to pass in NULL. Same for
> nfs_readpage. They use the file, if provided, to avoid some lookups, but
> seem to deal with not having it.

Yes, and reiser4 does the same.

> This answer by example is of course not authorative nor complete.

Yes... perhaps we should simply change this code to use NULL and
collect the bug-reports ;)

Oleg.
Srikar Dronamraju June 14, 2012, 11:45 a.m. UTC | #8
* Oleg Nesterov <oleg@redhat.com> [2012-06-13 21:15:19]:

> On 06/12, Oleg Nesterov wrote:
> >
> > On 06/12, Srikar Dronamraju wrote:
> > > >
> > > > Note also that we should move this !UPROBE_COPY_INSN from
> > > > install_breakpoint() to somewhere near alloc_uprobe(). This code
> > > > is called only once, it looks a bit strange to use the "random" mm
> > > > (the first mm vma_prio_tree_foreach() finds) and its mapping to
> > > > verify the insn. In fact this is simply not correct and should be
> > > > fixed, note that on x86 arch_uprobe_analyze_insn() checks
> > >
> > > The reason we "delay" the copy_insn to the first insert is because
> > > we have to get access to mm. For archs like x86, we want to know if the
> > > executable is 32 bit or not
> >
> > Yes. And this is wrong afaics.
> >
> > Once again. This !UPROBE_COPY_INSN code is called only once, and it
> > uses the "random" mm. After that install_breakpoint() just calls
> > set_swbp(another_mm) while the insn can be invalid because
> > another_mm->ia32_compat != mm->ia32_compat.
> >
> > > So in effect, if we get access to
> > > struct file corresponding to the inode and if the inode corresponds to
> > > 32 bit executable file or 64 bit executable file during register, then
> > > we can move it around alloc_uprobe().
> >
> > I don't think this can work. I have another simple fix in mind, I'll
> > write another email later.
> 
> For example. Suppose there is some instruction in /lib64/libc.so which
> is valid for 64-bit, but not for 32-bit.
> 
> Suppose that a 32-bit application does mmap("/lib64/libc.so", PROT_EXEC).
> 

How correct is it to have a 32 bit binary link to a 64 bit binary/library?
what if the 64 bit binary/executable were to make a jump to a 64 bit
address? 


> Now. If vma_prio_tree_foreach() finds this 32-bit mm first, uprobe_register()
> fails even if there are other 64-bit applications which could be traced.
> 
> Or. uprobe_register() succeeds because it finds a 64-bit mm first, and
> then that 32-bit application actually executes the invalid insn.
> 
> We can move arch_uprobe_analyze_insn() outside of !UPROBE_COPY_INSN block.
> 
> Or, perhaps, validate_insn_bits() should call both
> validate_insn_32bits() and validate_insn_64bits(), and set the
> UPROBE_VALID_IF_32 / UPROBE_VALID_IF_64 flags. install_breakpoint()
> should do the additinal check before set_swbp() and verify that
> .ia32_compat matches UPROBE_VALID_IF_*.
> 

> What do you think?
> 

Lets say we do find a 32 bit app and 64 bit app using the same library
and the underlying instruction is valid for tracing in 64 bit and not 32
bit. So when we are registering, and failed to insert a breakpoint  for
the 32 bit app, should we just bail out or should we return a failure?

I would probably prefer to read the underlying file something similar to
what exec does and based on the magic decipher if we should verify for
32 bit instructions or 64 bit instructions.

I didnt find a good way to read the first few bytes just by looking at
the inode. So one option is to read the underlying file at the first
insertion (i.e similar to what we do now .. but instead of depending on
ia32_compat of a random mm, depend on the exec magic)

Better option would be read the underlying file at the register time and
return an error even without attempting an insert if the instruction
wasnt valid. But I am still ignorant on how to do this because we need a
struct file to do this.
Oleg Nesterov June 14, 2012, 6:19 p.m. UTC | #9
On 06/14, Srikar Dronamraju wrote:
>
> * Oleg Nesterov <oleg@redhat.com> [2012-06-13 21:15:19]:
>
> > For example. Suppose there is some instruction in /lib64/libc.so which
> > is valid for 64-bit, but not for 32-bit.
> >
> > Suppose that a 32-bit application does mmap("/lib64/libc.so", PROT_EXEC).
> >
>
> How correct is it to have a 32 bit binary link to a 64 bit binary/library?

No, I didn't mean this. I guess you misunderstood my point, see below.

> > Now. If vma_prio_tree_foreach() finds this 32-bit mm first, uprobe_register()
> > fails even if there are other 64-bit applications which could be traced.
> >
> > Or. uprobe_register() succeeds because it finds a 64-bit mm first, and
> > then that 32-bit application actually executes the invalid insn.
> >
> > We can move arch_uprobe_analyze_insn() outside of !UPROBE_COPY_INSN block.
> >
> > Or, perhaps, validate_insn_bits() should call both
> > validate_insn_32bits() and validate_insn_64bits(), and set the
> > UPROBE_VALID_IF_32 / UPROBE_VALID_IF_64 flags. install_breakpoint()
> > should do the additinal check before set_swbp() and verify that
> > .ia32_compat matches UPROBE_VALID_IF_*.
> >
>
> > What do you think?
> >
>
> Lets say we do find a 32 bit app and 64 bit app using the same library
> and the underlying instruction is valid for tracing in 64 bit and not 32
> bit. So when we are registering, and failed to insert a breakpoint  for
> the 32 bit app, should we just bail out or should we return a failure?

I do not really know, I tend to think we should not fail. But this is
another story...

Look. Suppose that a 32-bit app starts after uprobe_register() succeeds.
In this case we have no option, uprobe_mmap()->install_breakpoint()
should "silently" fail. Currently it doesn't, this is one of the reasons
why I think the validation logic is wrong.

And. if install_breakpoint() can fail later anyway (in this case), then
I think uprobe_register() should not fail.

But probably this needs more discussion.


> I would probably prefer to read the underlying file something similar to
> what exec does and based on the magic decipher if we should verify for
> 32 bit instructions or 64 bit instructions.

But this can't protect from the malicious user who does
mmap(64-bit-code, PROT_EXEC) from a 32-bit app, and this can confuse
uprobes even if that 32-bit app never tries to actually execute that
64-bit-code.

That is why I think we need the additional (and arch-dependant) check
before every set_swbp(), but arch_uprobe_analyze_insn/etc should not
depend on task/mm/vaddr/whatever.

Oleg.
diff mbox

Patch

Index: linux-3.5-rc1/arch/x86/include/asm/uprobes.h
===================================================================
--- linux-3.5-rc1.orig/arch/x86/include/asm/uprobes.h
+++ linux-3.5-rc1/arch/x86/include/asm/uprobes.h
@@ -48,7 +48,7 @@  struct arch_uprobe_task {
 #endif
 };
 
-extern int  arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm);
+extern int  arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long addr);
 extern int  arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs);
 extern int  arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
 extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
Index: linux-3.5-rc1/arch/x86/kernel/uprobes.c
===================================================================
--- linux-3.5-rc1.orig/arch/x86/kernel/uprobes.c
+++ linux-3.5-rc1/arch/x86/kernel/uprobes.c
@@ -409,9 +409,10 @@  static int validate_insn_bits(struct arc
  * arch_uprobe_analyze_insn - instruction analysis including validity and fixups.
  * @mm: the probed address space.
  * @arch_uprobe: the probepoint information.
+ * @addr: virtual address at which to install the probepoint
  * Return 0 on success or a -ve number on error.
  */
-int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm)
+int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr)
 {
 	int ret;
 	struct insn insn;
Index: linux-3.5-rc1/kernel/events/uprobes.c
===================================================================
--- linux-3.5-rc1.orig/kernel/events/uprobes.c
+++ linux-3.5-rc1/kernel/events/uprobes.c
@@ -697,7 +697,7 @@  install_breakpoint(struct uprobe *uprobe
 		if (is_swbp_insn((uprobe_opcode_t *)uprobe->arch.insn))
 			return -EEXIST;
 
-		ret = arch_uprobe_analyze_insn(&uprobe->arch, mm);
+		ret = arch_uprobe_analyze_insn(&uprobe->arch, mm, addr);
 		if (ret)
 			return ret;