Message ID | 200812190208.mBJ28vn13713@elf.torek.net |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Headers | show |
From: Chris Torek <chris.torek@windriver.com> Date: Thu, 18 Dec 2008 19:08:57 -0700 > I'll send this formatted against the old kernel, not sparc-next > (did get the latter all cloned up, but have not yet started working > with it). > > This came up in testing kgdb, using the built-in tests -- turn > on CONFIG_KGDB_TESTS, then > > echo V1 > /sys/module/kgdbts/parameters/kgdbts > > -- but it would affect using kgdb if you were debugging and looking > at bad pointers. > > You will probably want to tweak it anyway (especially the comment > in the .S file: I thought use of KERNEL_DS was more rare than it > is, though it probably *should* be more rare than it is...). This is going to perform really bad, and it's common for syscalls invoked by the kernel to trigger this case. I'll have to think about this, thanks for pointing out the issue. -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Chris Torek <chris.torek@windriver.com> Date: Thu, 18 Dec 2008 20:11:26 -0700 > >This is going to perform really bad ... > > Yes, I went for space on the theory that it was not common... > > >and it's common for > >syscalls invoked by the kernel to trigger this case. > > ... but so much for that theory. :-) At least it used to be the case, it may not matter today and I'll do some investigation. If it doesn't matter, your patch is fine and I'll apply it directly. > >I'll have to think about this, thanks for pointing out the > >issue. > > The other obvious thing is to have a bunch of userlike memcpy > variants, one for each *memcpy.S file. The macros you have > should make this straightforward. I was mostly worried about > breaking stuff. I'm trending away from that because it's unnecessary bloat. I wish memcpy() returned void, that would make so many things like this easy to handle. I could just add the annotations to the real memcpy, always return the same return values as the user copies do, and the memcpy() callers would just ignore the return value. But things are what they are :-) -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: David Miller <davem@davemloft.net> Date: Thu, 18 Dec 2008 19:15:16 -0800 (PST) > From: Chris Torek <chris.torek@windriver.com> > Date: Thu, 18 Dec 2008 20:11:26 -0700 > > > >This is going to perform really bad ... > > > > Yes, I went for space on the theory that it was not common... > > > > >and it's common for > > >syscalls invoked by the kernel to trigger this case. > > > > ... but so much for that theory. :-) > > At least it used to be the case, it may not matter today and > I'll do some investigation. If it doesn't matter, your patch > is fine and I'll apply it directly. Ok, I finally had a chance to look things over and it turns out the performance of this case does in fact matter. A few examples that might copy lots of big chunks are: 1) ELF core dumping 2) NFSD file read and write 3) kernel_sendmsg() and kernel_recvmsg() which are used by things like NFS client sunrpc, and other network filesystem implementations (basically, grep the tree for "set_fs(KERNEL_DS)") So I started thinking about alternative ways to fix this. There might be some way we can annotate the regular memcpy() implementations with exception handling markers, like we do for the user copy cases, such that they only trigger the alternative return value handling when a special thread local flag has been set. Or something like that. Actually, we could do something really simple since we have control over the encoding of the exception table entries _and_ we have a simple way to determine if we are doing a potentially exception causing kernel memcpy(). First, if the %o7 or %i7 are inside of the function memcpy_user_stub, then we are doing one of these in-kernel copies that are allowed to fault. This elides the need for a special thread flag or anything like that. Second, we add the annotations to the regular memcpy() routines, but we set the "fixup" field of the exception table entries to have the low bit set. This works because all instructions are 4 byte aligned on sparc. Then the code in do_kernel_fault() can, if it sees a "fixup" value with the low bit set, check %o7/%i7 for being inside range of memcpy_user_stub. If so, we jump to the fixup. Otherwise we behave as if we did not find a matching exception table entry. I'll see if I can whip up an implementation of this. Thanks for your patience Chris. -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Chris Torek <chris.torek@windriver.com> Date: Mon, 29 Dec 2008 14:55:40 -0700 > >There might be some way we can annotate the regular memcpy() > >implementations with exception handling markers, like we do > >for the user copy cases, such that they only trigger the > >alternative return value handling when a special thread local > >flag has been set. > > I thought about this, but it's ugly and creates (at least potential) > problems with recursion / interrupts. (Not sure if these are possible > in the Linux kernel organization, just "icky on general principles".) Yes, it's not very clean. > >First, if the %o7 or %i7 are inside of the function memcpy_user_stub, > >then we are doing one of these in-kernel copies that are allowed to > >fault. This elides the need for a special thread flag or anything > >like that. > > This, on the other hand, seems quite promising: as with the exception > table itself, we use the program counter (of the return address, > in this case) as the flag. > > >Then the code in do_kernel_fault() can, if it sees a "fixup" value > >with the low bit set, check %o7/%i7 for being inside range of > >memcpy_user_stub. If so, we jump to the fixup. Otherwise we > >behave as if we did not find a matching exception table entry. > > This should work really well. In the end I decided that this other approach is very ad-hoc as well. I ended up settling on something similar to your approach, and I also implemented your .fixup space optimizations. Patch series posting forthcoming... -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/sparc64/lib/copy_in_user.S b/arch/sparc64/lib/copy_in_user.S index 650af3f..ba91c3e 100644 --- a/arch/sparc64/lib/copy_in_user.S +++ b/arch/sparc64/lib/copy_in_user.S @@ -105,15 +105,24 @@ ___copy_in_user: /* %o0=dst, %o1=src, %o2=len */ /* Act like copy_{to,in}_user(), ie. return zero instead * of original destination pointer. This is invoked when * copy_{to,in}_user() finds that %asi is kernel space. + * + * We can't quite use memcpy directly since we are supposed + * to return nonzero on faults. Could optimize these, but + * this is used extremely rarely (probe_kernel_read/write). */ .globl memcpy_user_stub .type memcpy_user_stub,#function memcpy_user_stub: - save %sp, -192, %sp - mov %i0, %o0 - mov %i1, %o1 - call memcpy - mov %i2, %o2 - ret - restore %g0, %g0, %o0 + brz,pn %o2, 2f + clr %o3 +1: + EX(ldub [%o1 + %o3], %g1) + EX(stb %g1, [%o0 + %o3]) + deccc %o2 + bgu,pt %XCC, 1b + inc %o3 +2: + retl + clr %o0 + .size memcpy_user_stub, .-memcpy_user_stub