Message ID | 20120608093257.GG13409@in.ibm.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
Ananth, Srikar, I think the patch is correct and I am sorry for nit-picking, this is really minor. But, On 06/08, Ananth N Mavinakayanahalli wrote: > > Changes in V2: > Pass (unsigned long)addr Well, IMHO, this is confusing. First of all, why do we have this "addr" or even "vaddr"? It should not exists. We pass it to copy_insn(), but for what?? copy_insn() should simply use uprobe->offset, the virtual address for this particular mapping does not matter at all. I am going to send the cleanup. Note also that we should move this !UPROBE_COPY_INSN from install_breakpoint() to somewhere near alloc_uprobe(). This code is called only once, it looks a bit strange to use the "random" mm (the first mm vma_prio_tree_foreach() finds) and its mapping to verify the insn. In fact this is simply not correct and should be fixed, note that on x86 arch_uprobe_analyze_insn() checks mm->context.ia32_compat. IOW, Perhaps uprobe->offset makes more sense? > --- linux-3.5-rc1.orig/kernel/events/uprobes.c > +++ linux-3.5-rc1/kernel/events/uprobes.c > @@ -697,7 +697,7 @@ install_breakpoint(struct uprobe *uprobe > if (is_swbp_insn((uprobe_opcode_t *)uprobe->arch.insn)) > return -EEXIST; > > - ret = arch_uprobe_analyze_insn(&uprobe->arch, mm); > + ret = arch_uprobe_analyze_insn(&uprobe->arch, mm, addr); Just fyi, this conflicts with "[PATCH 1/3] uprobes: install_breakpoint() should fail if is_swbp_insn() == T" I sent, but the conflict is trivial. Oleg.
On 06/11, Oleg Nesterov wrote: > > Note also that we should move this !UPROBE_COPY_INSN from > install_breakpoint() to somewhere near alloc_uprobe(). The main problem is, uprobe_register() doesn't have struct file for read_mapping_page(). Stupid question. I'm afraid the answer is "no" but I'll ask anyway. Is it safe to pass filp == NULL to mapping->readpage()? In fact I do not understand why it needs "struct file*" and I do not see any example of actual usage. Yes, I didn't try to grep very much and I understand that the filesystem can do something special. Say it can use file->private_data... However. There is read_cache_page_gfp() which does use a_ops->readpage(filp => NULL), and the comment says nothing about the riskiness. If not, is there any other way uprobe_register(struct inode *inode, loff_t offset) can read the page at this offset? And btw, read_mapping_page() accepts "void *data". Why? it uses filler == a_ops->readpage, it shouldn't accept anything but file pointer? Please help, I know nothing about vfs. Oleg.
> > Well, IMHO, this is confusing. > > First of all, why do we have this "addr" or even "vaddr"? It should > not exists. We pass it to copy_insn(), but for what?? copy_insn() > should simply use uprobe->offset, the virtual address for this > particular mapping does not matter at all. I am going to send > the cleanup. > Yes, we can use uprobe->offset instead of vaddr/addr. > Note also that we should move this !UPROBE_COPY_INSN from > install_breakpoint() to somewhere near alloc_uprobe(). This code > is called only once, it looks a bit strange to use the "random" mm > (the first mm vma_prio_tree_foreach() finds) and its mapping to > verify the insn. In fact this is simply not correct and should be > fixed, note that on x86 arch_uprobe_analyze_insn() checks The reason we "delay" the copy_insn to the first insert is because we have to get access to mm. For archs like x86, we want to know if the executable is 32 bit or not (since we have a different valid set of instructions for 32 bit and 64 bit). So in effect, if we get access to struct file corresponding to the inode and if the inode corresponds to 32 bit executable file or 64 bit executable file during register, then we can move it around alloc_uprobe().
On 06/12, Srikar Dronamraju wrote: > > > > Note also that we should move this !UPROBE_COPY_INSN from > > install_breakpoint() to somewhere near alloc_uprobe(). This code > > is called only once, it looks a bit strange to use the "random" mm > > (the first mm vma_prio_tree_foreach() finds) and its mapping to > > verify the insn. In fact this is simply not correct and should be > > fixed, note that on x86 arch_uprobe_analyze_insn() checks > > The reason we "delay" the copy_insn to the first insert is because > we have to get access to mm. For archs like x86, we want to know if the > executable is 32 bit or not Yes. And this is wrong afaics. Once again. This !UPROBE_COPY_INSN code is called only once, and it uses the "random" mm. After that install_breakpoint() just calls set_swbp(another_mm) while the insn can be invalid because another_mm->ia32_compat != mm->ia32_compat. > So in effect, if we get access to > struct file corresponding to the inode and if the inode corresponds to > 32 bit executable file or 64 bit executable file during register, then > we can move it around alloc_uprobe(). I don't think this can work. I have another simple fix in mind, I'll write another email later. Oleg.
On Mon, 2012-06-11 at 21:09 +0200, Oleg Nesterov wrote: > Stupid question. I'm afraid the answer is "no" but I'll ask anyway. > Is it safe to pass filp == NULL to mapping->readpage()? In fact > I do not understand why it needs "struct file*" and I do not see > any example of actual usage. Looking at afs_readpage it looks like its OK to pass in NULL. Same for nfs_readpage. They use the file, if provided, to avoid some lookups, but seem to deal with not having it. This answer by example is of course not authorative nor complete.
On 06/12, Oleg Nesterov wrote: > > On 06/12, Srikar Dronamraju wrote: > > > > > > Note also that we should move this !UPROBE_COPY_INSN from > > > install_breakpoint() to somewhere near alloc_uprobe(). This code > > > is called only once, it looks a bit strange to use the "random" mm > > > (the first mm vma_prio_tree_foreach() finds) and its mapping to > > > verify the insn. In fact this is simply not correct and should be > > > fixed, note that on x86 arch_uprobe_analyze_insn() checks > > > > The reason we "delay" the copy_insn to the first insert is because > > we have to get access to mm. For archs like x86, we want to know if the > > executable is 32 bit or not > > Yes. And this is wrong afaics. > > Once again. This !UPROBE_COPY_INSN code is called only once, and it > uses the "random" mm. After that install_breakpoint() just calls > set_swbp(another_mm) while the insn can be invalid because > another_mm->ia32_compat != mm->ia32_compat. > > > So in effect, if we get access to > > struct file corresponding to the inode and if the inode corresponds to > > 32 bit executable file or 64 bit executable file during register, then > > we can move it around alloc_uprobe(). > > I don't think this can work. I have another simple fix in mind, I'll > write another email later. For example. Suppose there is some instruction in /lib64/libc.so which is valid for 64-bit, but not for 32-bit. Suppose that a 32-bit application does mmap("/lib64/libc.so", PROT_EXEC). Now. If vma_prio_tree_foreach() finds this 32-bit mm first, uprobe_register() fails even if there are other 64-bit applications which could be traced. Or. uprobe_register() succeeds because it finds a 64-bit mm first, and then that 32-bit application actually executes the invalid insn. We can move arch_uprobe_analyze_insn() outside of !UPROBE_COPY_INSN block. Or, perhaps, validate_insn_bits() should call both validate_insn_32bits() and validate_insn_64bits(), and set the UPROBE_VALID_IF_32 / UPROBE_VALID_IF_64 flags. install_breakpoint() should do the additinal check before set_swbp() and verify that .ia32_compat matches UPROBE_VALID_IF_*. What do you think? Oleg.
On 06/13, Peter Zijlstra wrote: > > On Mon, 2012-06-11 at 21:09 +0200, Oleg Nesterov wrote: > > Stupid question. I'm afraid the answer is "no" but I'll ask anyway. > > Is it safe to pass filp == NULL to mapping->readpage()? In fact > > I do not understand why it needs "struct file*" and I do not see > > any example of actual usage. > > Looking at afs_readpage it looks like its OK to pass in NULL. Same for > nfs_readpage. They use the file, if provided, to avoid some lookups, but > seem to deal with not having it. Yes, and reiser4 does the same. > This answer by example is of course not authorative nor complete. Yes... perhaps we should simply change this code to use NULL and collect the bug-reports ;) Oleg.
* Oleg Nesterov <oleg@redhat.com> [2012-06-13 21:15:19]: > On 06/12, Oleg Nesterov wrote: > > > > On 06/12, Srikar Dronamraju wrote: > > > > > > > > Note also that we should move this !UPROBE_COPY_INSN from > > > > install_breakpoint() to somewhere near alloc_uprobe(). This code > > > > is called only once, it looks a bit strange to use the "random" mm > > > > (the first mm vma_prio_tree_foreach() finds) and its mapping to > > > > verify the insn. In fact this is simply not correct and should be > > > > fixed, note that on x86 arch_uprobe_analyze_insn() checks > > > > > > The reason we "delay" the copy_insn to the first insert is because > > > we have to get access to mm. For archs like x86, we want to know if the > > > executable is 32 bit or not > > > > Yes. And this is wrong afaics. > > > > Once again. This !UPROBE_COPY_INSN code is called only once, and it > > uses the "random" mm. After that install_breakpoint() just calls > > set_swbp(another_mm) while the insn can be invalid because > > another_mm->ia32_compat != mm->ia32_compat. > > > > > So in effect, if we get access to > > > struct file corresponding to the inode and if the inode corresponds to > > > 32 bit executable file or 64 bit executable file during register, then > > > we can move it around alloc_uprobe(). > > > > I don't think this can work. I have another simple fix in mind, I'll > > write another email later. > > For example. Suppose there is some instruction in /lib64/libc.so which > is valid for 64-bit, but not for 32-bit. > > Suppose that a 32-bit application does mmap("/lib64/libc.so", PROT_EXEC). > How correct is it to have a 32 bit binary link to a 64 bit binary/library? what if the 64 bit binary/executable were to make a jump to a 64 bit address? > Now. If vma_prio_tree_foreach() finds this 32-bit mm first, uprobe_register() > fails even if there are other 64-bit applications which could be traced. > > Or. uprobe_register() succeeds because it finds a 64-bit mm first, and > then that 32-bit application actually executes the invalid insn. > > We can move arch_uprobe_analyze_insn() outside of !UPROBE_COPY_INSN block. > > Or, perhaps, validate_insn_bits() should call both > validate_insn_32bits() and validate_insn_64bits(), and set the > UPROBE_VALID_IF_32 / UPROBE_VALID_IF_64 flags. install_breakpoint() > should do the additinal check before set_swbp() and verify that > .ia32_compat matches UPROBE_VALID_IF_*. > > What do you think? > Lets say we do find a 32 bit app and 64 bit app using the same library and the underlying instruction is valid for tracing in 64 bit and not 32 bit. So when we are registering, and failed to insert a breakpoint for the 32 bit app, should we just bail out or should we return a failure? I would probably prefer to read the underlying file something similar to what exec does and based on the magic decipher if we should verify for 32 bit instructions or 64 bit instructions. I didnt find a good way to read the first few bytes just by looking at the inode. So one option is to read the underlying file at the first insertion (i.e similar to what we do now .. but instead of depending on ia32_compat of a random mm, depend on the exec magic) Better option would be read the underlying file at the register time and return an error even without attempting an insert if the instruction wasnt valid. But I am still ignorant on how to do this because we need a struct file to do this.
On 06/14, Srikar Dronamraju wrote: > > * Oleg Nesterov <oleg@redhat.com> [2012-06-13 21:15:19]: > > > For example. Suppose there is some instruction in /lib64/libc.so which > > is valid for 64-bit, but not for 32-bit. > > > > Suppose that a 32-bit application does mmap("/lib64/libc.so", PROT_EXEC). > > > > How correct is it to have a 32 bit binary link to a 64 bit binary/library? No, I didn't mean this. I guess you misunderstood my point, see below. > > Now. If vma_prio_tree_foreach() finds this 32-bit mm first, uprobe_register() > > fails even if there are other 64-bit applications which could be traced. > > > > Or. uprobe_register() succeeds because it finds a 64-bit mm first, and > > then that 32-bit application actually executes the invalid insn. > > > > We can move arch_uprobe_analyze_insn() outside of !UPROBE_COPY_INSN block. > > > > Or, perhaps, validate_insn_bits() should call both > > validate_insn_32bits() and validate_insn_64bits(), and set the > > UPROBE_VALID_IF_32 / UPROBE_VALID_IF_64 flags. install_breakpoint() > > should do the additinal check before set_swbp() and verify that > > .ia32_compat matches UPROBE_VALID_IF_*. > > > > > What do you think? > > > > Lets say we do find a 32 bit app and 64 bit app using the same library > and the underlying instruction is valid for tracing in 64 bit and not 32 > bit. So when we are registering, and failed to insert a breakpoint for > the 32 bit app, should we just bail out or should we return a failure? I do not really know, I tend to think we should not fail. But this is another story... Look. Suppose that a 32-bit app starts after uprobe_register() succeeds. In this case we have no option, uprobe_mmap()->install_breakpoint() should "silently" fail. Currently it doesn't, this is one of the reasons why I think the validation logic is wrong. And. if install_breakpoint() can fail later anyway (in this case), then I think uprobe_register() should not fail. But probably this needs more discussion. > I would probably prefer to read the underlying file something similar to > what exec does and based on the magic decipher if we should verify for > 32 bit instructions or 64 bit instructions. But this can't protect from the malicious user who does mmap(64-bit-code, PROT_EXEC) from a 32-bit app, and this can confuse uprobes even if that 32-bit app never tries to actually execute that 64-bit-code. That is why I think we need the additional (and arch-dependant) check before every set_swbp(), but arch_uprobe_analyze_insn/etc should not depend on task/mm/vaddr/whatever. Oleg.
Index: linux-3.5-rc1/arch/x86/include/asm/uprobes.h =================================================================== --- linux-3.5-rc1.orig/arch/x86/include/asm/uprobes.h +++ linux-3.5-rc1/arch/x86/include/asm/uprobes.h @@ -48,7 +48,7 @@ struct arch_uprobe_task { #endif }; -extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm); +extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long addr); extern int arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs); extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs); extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk); Index: linux-3.5-rc1/arch/x86/kernel/uprobes.c =================================================================== --- linux-3.5-rc1.orig/arch/x86/kernel/uprobes.c +++ linux-3.5-rc1/arch/x86/kernel/uprobes.c @@ -409,9 +409,10 @@ static int validate_insn_bits(struct arc * arch_uprobe_analyze_insn - instruction analysis including validity and fixups. * @mm: the probed address space. * @arch_uprobe: the probepoint information. + * @addr: virtual address at which to install the probepoint * Return 0 on success or a -ve number on error. */ -int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm) +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr) { int ret; struct insn insn; Index: linux-3.5-rc1/kernel/events/uprobes.c =================================================================== --- linux-3.5-rc1.orig/kernel/events/uprobes.c +++ linux-3.5-rc1/kernel/events/uprobes.c @@ -697,7 +697,7 @@ install_breakpoint(struct uprobe *uprobe if (is_swbp_insn((uprobe_opcode_t *)uprobe->arch.insn)) return -EEXIST; - ret = arch_uprobe_analyze_insn(&uprobe->arch, mm); + ret = arch_uprobe_analyze_insn(&uprobe->arch, mm, addr); if (ret) return ret;