Message ID | 1347057778.26695.68.camel@sbsiddha-desk.sc.intel.com |
---|---|
State | New, archived |
Headers | show |
On Fri, Sep 7, 2012 at 3:42 PM, Suresh Siddha <suresh.b.siddha@intel.com> wrote: > - unsigned long start; > - unsigned long off; > - u32 len; > + resource_size_t start, off; > + unsigned long len; So since the oops is on x86-64, I don't think it's the "unsigned long" -> "resource_size_t" part (which can be an issue on 32-bit architectures, though). The "u32 len" -> "unsigned long len" thing *might* make a difference, though. I also think your patch is incomplete even on 32-bit, because this: > if (mtd->type == MTD_RAM || mtd->type == MTD_ROM) { > off = vma->vm_pgoff << PAGE_SHIFT; is still wrong. It probably should be off = vma->vm_pgoff; off <<= PAGE_SHIFT; because vm_pgoff may be a 32-bit type, while "resource_size_t" may be 64-bit. Shifting the 32-bit type without a cast (implicit or explicit) isn't going to help. That said, we have absolutely *tons* of bugs with this particular pattern. Just do git grep 'vm_pgoff.*<<.*PAGE_SHIFT' and there are distressingly few casts in there (there's a few, mainly in fs/proc). Now, I suspect many of them are fine just because most users probably are size-limited anyway, but it's a bit distressing stuff. And I suspect it means we might want to introduce a helper function like static inline u64 vm_offset(struct vm_area_struct *vma) { return (u64)vma->vm_pgoff << PAGE_SHIFT; } or something. Maybe add the "vm_length()" helper while at it too, since the whole "vma->vm_end - vma->vm_start" thing is so common. Anyway, since Sasha's oops is clearly not 32-bit, the above issues don't matter, and it would be interesting to hear if it's the 32-bit 'len' thing that triggers this problem. Still, I can't see how it would - as far as I can tell, a truncated 'len' would at most result in spurious early "return -EINVAL", not any real problem. What are we missing? Sasha, since you can apparently reproduce it, can you replace the "BUG_ON()" with just a if (start >= end) { printf("bogus range %llx - %llx\n", start, end); return -EINVAL; } or something. I'm starting to suspect that maybe it's actually that the length is *zero*, and start == end, and that we should just return zero for that case. But let's see what Sasha finds.. Linus
On Fri, 2012-09-07 at 16:09 -0700, Linus Torvalds wrote: > The "u32 len" -> "unsigned long len" thing *might* make a difference, though. This I believe doesn't fix the reported BUG. I was trying to address your previous comment about broken types. > > I also think your patch is incomplete even on 32-bit, because this: > > > if (mtd->type == MTD_RAM || mtd->type == MTD_ROM) { > > off = vma->vm_pgoff << PAGE_SHIFT; > > is still wrong. It probably should be > > off = vma->vm_pgoff; > off <<= PAGE_SHIFT; > > because vm_pgoff may be a 32-bit type, while "resource_size_t" may be > 64-bit. Shifting the 32-bit type without a cast (implicit or explicit) > isn't going to help. Agree. > That said, we have absolutely *tons* of bugs with this particular > pattern. Just do > > git grep 'vm_pgoff.*<<.*PAGE_SHIFT' > > and there are distressingly few casts in there (there's a few, mainly > in fs/proc). > > Now, I suspect many of them are fine just because most users probably > are size-limited anyway, but it's a bit distressing stuff. And I > suspect it means we might want to introduce a helper function like > > static inline u64 vm_offset(struct vm_area_struct *vma) > { > return (u64)vma->vm_pgoff << PAGE_SHIFT; > } > > or something. Maybe add the "vm_length()" helper while at it too, > since the whole "vma->vm_end - vma->vm_start" thing is so common. Agree. > Anyway, since Sasha's oops is clearly not 32-bit, the above issues > don't matter, and it would be interesting to hear if it's the 32-bit > 'len' thing that triggers this problem. Still, I can't see how it > would - as far as I can tell, a truncated 'len' would at most result > in spurious early "return -EINVAL", not any real problem. > > What are we missing? > On Fri, 2012-09-07 at 15:42 -0700, Suresh Siddha wrote: > - if ((vma->vm_end - vma->vm_start + off) > len) > + if (off >= len || (vma->vm_end - vma->vm_start + off) > len) > return -EINVAL; This is the relevant portion that I am thinking will address the BUG. Essentially the user is trying to mmap at a very large offset (from the oops it appears "vma->vm_pgoff << PAGE_SHIFT + start" ends up to "0xfffffffffffff000"). So it appears that the condition "(vma->vm_end - vma->vm_start + off) > len" might be false because of the wraparound? and doesn't return -EINVAL. Let's see what Sasha finds. Anyways the patch does indeed require your above mentioned vm_pgoff fix for the 32-bit case. thanks, suresh
On 09/08/2012 01:09 AM, Linus Torvalds wrote: > Sasha, since you can apparently reproduce it, can you replace the > "BUG_ON()" with just a > > if (start >= end) { > printf("bogus range %llx - %llx\n", start, end); > return -EINVAL; > } Replacing it gives me the following: [ 36.231736] bogus range fffffffffffff000 - 0 Thanks, Sasha
On Fri, Sep 7, 2012 at 4:54 PM, Suresh Siddha <suresh.b.siddha@intel.com> wrote: > > Essentially the user is trying to mmap at a very large offset (from the > oops it appears "vma->vm_pgoff << PAGE_SHIFT + start" ends up to > "0xfffffffffffff000"). Ok, Sasha confirmed that. > So it appears that the condition "(vma->vm_end - vma->vm_start + off) > > len" might be false because of the wraparound? and doesn't return > -EINVAL. Ack. Anyway, that means that the BUG_ON() is likely bogus, but so is the whole calling convention. The 4kB range starting at 0xfffffffffffff000 sounds like a *valid* range, but that requires that we fix the calling convention to not have that "end" (exclusive) thing. It should either be "end" (inclusive), or just "len". So it should either be start=0xfffffffffffff000 end=0xffffffffffffffff or it should be start=0xfffffffffffff000 len=0x1000. Or we need to say that we must never accept things at the end of the 64-bit range. Whatever. Something like this (TOTALLY UNTESTED) attached patch should get the mtdchar overflows to go away, but it does *not* fix the fact that the MTRR start/end model is broken. It really is technically valid to have a resource_size_t range of 0xfffffffffffff000+0x1000, and right now it causes a BUG_ON() in pat.c. Suresh? Linus
On Sat, 2012-09-08 at 12:57 -0700, Linus Torvalds wrote: > Whatever. Something like this (TOTALLY UNTESTED) attached patch should > get the mtdchar overflows to go away, It looks good to me. Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Sasha, can you please give this a try? > but it does *not* fix the fact > that the MTRR start/end model is broken. It really is technically > valid to have a resource_size_t range of 0xfffffffffffff000+0x1000, > and right now it causes a BUG_ON() in pat.c. > > Suresh? yes but that is not a valid range I think because of the supported physical address bit limits of the processor and also the max architecture limit of 52 address bits. I guess we should be checking for those limits in pat.c, especially bits above 52 are ignored by the HW and they can easily cause conflicting aliases with other valid regions. I will get back with a different patch to fix this. thanks, suresh
On Sun, Sep 9, 2012 at 7:56 AM, Suresh Siddha <suresh.b.siddha@intel.com> wrote: > > yes but that is not a valid range I think because of the supported > physical address bit limits of the processor and also the max > architecture limit of 52 address bits. But how could the caller possibly know that? None of those internal PAT limits are exposed anywhere. So doing the BUG_ON() is wrong. I'd suggest changing it to an EINVAL. In fact, BUG_ON() is *always* wrong, unless it's a "my internal data structures are so messed up that I cannot continue". Linus
On 09/08/2012 12:57 PM, Linus Torvalds wrote: > > Ack. > > Anyway, that means that the BUG_ON() is likely bogus, but so is the > whole calling convention. > > The 4kB range starting at 0xfffffffffffff000 sounds like a *valid* > range, but that requires that we fix the calling convention to not > have that "end" (exclusive) thing. It should either be "end" > (inclusive), or just "len". > On x86, it is definitely NOT a valid range. There is no physical addresses there, and there will never be any. > So it should either be start=0xfffffffffffff000 end=0xffffffffffffffff > or it should be start=0xfffffffffffff000 len=0x1000. I would strongly object to the former; that kind of inclusive ranges breed a whole class of bugs by themselves. -hpa
On 09/09/2012 08:31 AM, Linus Torvalds wrote: > On Sun, Sep 9, 2012 at 7:56 AM, Suresh Siddha <suresh.b.siddha@intel.com> wrote: >> >> yes but that is not a valid range I think because of the supported >> physical address bit limits of the processor and also the max >> architecture limit of 52 address bits. > > But how could the caller possibly know that? None of those internal > PAT limits are exposed anywhere. > > So doing the BUG_ON() is wrong. I'd suggest changing it to an EINVAL. > > In fact, BUG_ON() is *always* wrong, unless it's a "my internal data > structures are so messed up that I cannot continue". > I suspect the right answer is doing something like: u64 max_phys = 1ULL << boot_cpu_data.x86_phys_bits; if (start >= max_phys || end > max_phys || start >= end) return -EINVAL; ... although max_phys perhaps should be precalculated and stored in struct cpuinfo_x86 instead of being generated de novo. -hpa
On Sun, 2012-09-09 at 09:56 -0700, H. Peter Anvin wrote: > > > So it should either be start=0xfffffffffffff000 end=0xffffffffffffffff > > or it should be start=0xfffffffffffff000 len=0x1000. > > I would strongly object to the former; that kind of inclusive ranges > breed a whole class of bugs by themselves. Another alternative that avoids overflow issues is to use a PFN rather than a byte address.
On 09/09/2012 12:04 PM, David Woodhouse wrote: > On Sun, 2012-09-09 at 09:56 -0700, H. Peter Anvin wrote: >> >>> So it should either be start=0xfffffffffffff000 end=0xffffffffffffffff >>> or it should be start=0xfffffffffffff000 len=0x1000. >> >> I would strongly object to the former; that kind of inclusive ranges >> breed a whole class of bugs by themselves. > > Another alternative that avoids overflow issues is to use a PFN rather > than a byte address. > Except as a result of that logic have a bunch of places which either have rounding errors in how they calculate PFNs, or they think they can stick PFNs into 32-bit numbers. :( -hpa
On 09/09/2012 06:56 PM, H. Peter Anvin wrote: >> >> Anyway, that means that the BUG_ON() is likely bogus, but so is the >> whole calling convention. >> >> The 4kB range starting at 0xfffffffffffff000 sounds like a *valid* >> range, but that requires that we fix the calling convention to not >> have that "end" (exclusive) thing. It should either be "end" >> (inclusive), or just "len". >> > > On x86, it is definitely NOT a valid range. There is no physical addresses > there, and there will never be any. This reminds me a similar issue: If you try to mmap /dev/kmem at an offset which is not kernel owned (such as 0), you'll get all the way to __pa() before getting a BUG() about addresses not making sense. How come there's no arch-specific validation of attempts to access virtual/physical addresses? In the kmem example I'd assume that something very early on should be yelling at me about doing something like that, but for some reason I get all the way to __pa() before getting a BUG() (!). Thanks, Sasha
On 09/09/2012 04:56 PM, Suresh Siddha wrote: > On Sat, 2012-09-08 at 12:57 -0700, Linus Torvalds wrote: >> > Whatever. Something like this (TOTALLY UNTESTED) attached patch should >> > get the mtdchar overflows to go away, > It looks good to me. Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> > > Sasha, can you please give this a try? Sorry for the delay. It looks good here. Thanks, Sasha
On 09/12/2012 12:50 PM, Sasha Levin wrote: > On 09/09/2012 04:56 PM, Suresh Siddha wrote: >> On Sat, 2012-09-08 at 12:57 -0700, Linus Torvalds wrote: >>>> Whatever. Something like this (TOTALLY UNTESTED) attached patch should >>>> get the mtdchar overflows to go away, >> It looks good to me. Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> >> >> Sasha, can you please give this a try? > > Sorry for the delay. It looks good here. > > > Thanks, > Sasha > Uh... sorry again, I obviously tested the second patch sent by Linus but mistakingly replied to the wrong mail in the thread.
On 09/12/2012 12:56 PM, Sasha Levin wrote: > On 09/12/2012 12:50 PM, Sasha Levin wrote: >> On 09/09/2012 04:56 PM, Suresh Siddha wrote: >>> On Sat, 2012-09-08 at 12:57 -0700, Linus Torvalds wrote: >>>>> Whatever. Something like this (TOTALLY UNTESTED) attached patch should >>>>> get the mtdchar overflows to go away, >>> It looks good to me. Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> >>> >>> Sasha, can you please give this a try? >> >> Sorry for the delay. It looks good here. >> >> >> Thanks, >> Sasha >> > > Uh... sorry again, I obviously tested the second patch sent by Linus but > mistakingly replied to the wrong mail in the thread. Is anyone planning on picking up Linus' patch? This is still not in -next even. Thanks, Sasha
On Fri, Sep 28, 2012 at 2:00 AM, Sasha Levin <levinsasha928@gmail.com> wrote: > > Is anyone planning on picking up Linus' patch? This is still not in -next even. I was really hoping it would go through the regular channels and come back to me that way, since I can't really test it, and it's bigger than the trivial obvious one-liners that I'm happy to commit. Hmm. Linus
On Fri, 2012-09-28 at 09:44 -0700, Linus Torvalds wrote: > On Fri, Sep 28, 2012 at 2:00 AM, Sasha Levin <levinsasha928@gmail.com> wrote: > > > > Is anyone planning on picking up Linus' patch? This is still not in -next even. > > I was really hoping it would go through the regular channels and come > back to me that way, since I can't really test it, and it's bigger > than the trivial obvious one-liners that I'm happy to commit. Hi Linus, I am not the maintainer, but please, go ahead an push your fix. I do not have time to test it myself and it does not look like anyone else in the small mtd community does.
On Fri, 2012-09-28 at 09:44 -0700, Linus Torvalds wrote: > On Fri, Sep 28, 2012 at 2:00 AM, Sasha Levin <levinsasha928@gmail.com> wrote: > > > > Is anyone planning on picking up Linus' patch? This is still not in -next even. > > I was really hoping it would go through the regular channels and come > back to me that way, since I can't really test it, and it's bigger > than the trivial obvious one-liners that I'm happy to commit. I can't test it on real hardware here either, but I can pull it through my tree.
On Fri, Sep 28, 2012 at 11:05 AM, Artem Bityutskiy <dedekind1@gmail.com> wrote: > > I am not the maintainer, but please, go ahead an push your fix. I do not > have time to test it myself and it does not look like anyone else in the > small mtd community does. Grr. I told people that patch wasn't tested. I hadn't even *compiled* it. It has a typo ("inlint" instead of "inline" - so close). Sasha said he had tested it, but nobody even mentioned this thing. Now I'm nervous. I had committed it in my tree and was just about to push it out when I decided that I should at least compile it despite the "tested-by". Hmm? Now I really *really* want to know that it's been tested on actual hardware too. Sasha, what patch did you actually test? Did you just fix the "inlint" thing, or was there something else entirely? Linus PS. Artem: your "Reply-to" is broken, and doesn't have your real name. Please fix.
On Fri, Sep 28, 2012 at 9:04 PM, David Woodhouse <dwmw2@infradead.org> wrote: > On Fri, 2012-09-28 at 09:44 -0700, Linus Torvalds wrote: >> On Fri, Sep 28, 2012 at 2:00 AM, Sasha Levin <levinsasha928@gmail.com> wrote: >> > >> > Is anyone planning on picking up Linus' patch? This is still not in -next even. >> >> I was really hoping it would go through the regular channels and come >> back to me that way, since I can't really test it, and it's bigger >> than the trivial obvious one-liners that I'm happy to commit. > > I can't test it on real hardware here either, but I can pull it through > my tree. If you can a easy test-case I can test it on real hardware.
On Fri, Sep 28, 2012 at 9:15 PM, richard -rw- weinberger <richard.weinberger@gmail.com> wrote: > On Fri, Sep 28, 2012 at 9:04 PM, David Woodhouse <dwmw2@infradead.org> wrote: >> On Fri, 2012-09-28 at 09:44 -0700, Linus Torvalds wrote: >>> On Fri, Sep 28, 2012 at 2:00 AM, Sasha Levin <levinsasha928@gmail.com> wrote: >>> > >>> > Is anyone planning on picking up Linus' patch? This is still not in -next even. >>> >>> I was really hoping it would go through the regular channels and come >>> back to me that way, since I can't really test it, and it's bigger >>> than the trivial obvious one-liners that I'm happy to commit. >> >> I can't test it on real hardware here either, but I can pull it through >> my tree. > > If you can a easy test-case I can test it on real hardware. Bah. If you have an easy test case I can run it on real hardware. :)
On 09/28/2012 09:13 PM, Linus Torvalds wrote: > On Fri, Sep 28, 2012 at 11:05 AM, Artem Bityutskiy <dedekind1@gmail.com> wrote: >> >> I am not the maintainer, but please, go ahead an push your fix. I do not >> have time to test it myself and it does not look like anyone else in the >> small mtd community does. > > Grr. I told people that patch wasn't tested. I hadn't even *compiled* > it. It has a typo ("inlint" instead of "inline" - so close). > > Sasha said he had tested it, but nobody even mentioned this thing. Now > I'm nervous. I had committed it in my tree and was just about to push > it out when I decided that I should at least compile it despite the > "tested-by". > > Hmm? Now I really *really* want to know that it's been tested on > actual hardware too. Sasha, what patch did you actually test? Did you > just fix the "inlint" thing, or was there something else entirely? I've just fixed the inlint thing since it was pretty trivial, I didn't bother commenting on it since I figured it would turn into an actual patch from someone who could test it on actual hardware first. Note that I've tested it on a KVM guest, and not on real hardware - so I'm not sure how much of a test that is. Thanks, Sasha
diff --git a/drivers/mtd/mtdchar.c b/drivers/mtd/mtdchar.c index f2f482b..f79c0fa 100644 --- a/drivers/mtd/mtdchar.c +++ b/drivers/mtd/mtdchar.c @@ -1132,16 +1132,15 @@ static int mtdchar_mmap(struct file *file, struct vm_area_struct *vma) struct mtd_file_info *mfi = file->private_data; struct mtd_info *mtd = mfi->mtd; struct map_info *map = mtd->priv; - unsigned long start; - unsigned long off; - u32 len; + resource_size_t start, off; + unsigned long len; if (mtd->type == MTD_RAM || mtd->type == MTD_ROM) { off = vma->vm_pgoff << PAGE_SHIFT; start = map->phys; len = PAGE_ALIGN((start & ~PAGE_MASK) + map->size); start &= PAGE_MASK; - if ((vma->vm_end - vma->vm_start + off) > len) + if (off >= len || (vma->vm_end - vma->vm_start + off) > len) return -EINVAL; off += start;