Message ID: cover.1611103406.git.thehajime@gmail.com
State: Not Applicable
Hi,

So I'm still a bit lost here with this, and what exactly you're doing in places.

For example, you simulate a single CPU ("depends on !SMP", and anyway UML only supports that right now), yet on the other hand do a *LOT* of extra work with lkl_sem, lkl_thread, lkl_mutex, and all that. It's not clear to me why? Are you trying to model kernel threads as actual userspace pthreads, but then run only one at a time by way of exclusive locking?

I think we probably need a bit more architecture introduction here in the cover letter or the documentation patch. The doc patch basically just explains what it does, but not how it does anything, or why it was done in this way.

For example, I'm asking myself:

 * Why NOMMU? UML doesn't really do _much_ with memory protection unless you add userspace, which you don't have.

 * Why pthreads and all? You already require jump_buf, so UML's switch_threads() ought to be just fine for scheduling? It almost seems like you're doing this just so you can serialize against "other threads" (application threads), but wouldn't that trivially be handled by the application? You could let it hook into switch_to() or something, but why should a single "LKL" CPU ever require multiple threads? Seems to me that the userspace could be required to call "lkl_run()" or so (vs. lkl_start()). Heck, you could even exit lkl_run() every time you switch tasks in the kernel, and leave scheduling the kernel vs. the application entirely up to the application? (A trivial application would simply do something like "while (1) { lkl_run(); pause(); }", mimicking the idle loop of UML.)

And - kind of the theme behind all these questions - why is this not making UML actually be a binary that uses LKL? If the design were like what I'm alluding to above, that should actually be possible? Why should it not be possible? Why would it not be desirable? (I'm actually thinking that might be really useful to some of the things I'm doing.)
Yes, if the application actually supports userspace running then it has some limitations on what it can do (in particular wrt. signals etc.), but that could be documented and would be OK?

johannes
Hello,

First of all, thanks for all the comments on the patchset, which has been a bit stale. I'll reply to them.

On Mon, 15 Mar 2021 06:03:19 +0900, Johannes Berg wrote:
>
> Hi,
>
> So I'm still a bit lost here with this, and what exactly you're doing in places.
>
> For example, you simulate a single CPU ("depends on !SMP", and anyway UML only supports that right now), yet on the other hand do a *LOT* of extra work with lkl_sem, lkl_thread, lkl_mutex, and all that. It's not clear to me why? Are you trying to model kernel threads as actual userspace pthreads, but then run only one at a time by way of exclusive locking?
>
> I think we probably need a bit more architecture introduction here in the cover letter or the documentation patch. The doc patch basically just explains what it does, but not how it does anything, or why it was done in this way.

We didn't write down the details, which are already described in LKL's paper (*1). But I think we can extract/summarize some of the important information from the paper into the document so that the design is more understandable.

*1 LKL's paper (pointer is also in the cover letter)
https://www.researchgate.net/profile/Nicolae_Tapus2/publication/224164682_LKL_The_Linux_kernel_library/links/02bfe50fd921ab4f7c000000.pdf

> For example, I'm asking myself:
> * Why NOMMU? UML doesn't really do _much_ with memory protection unless you add userspace, which you don't have.

My interpretation of MMU/NOMMU is like this:

With an (emulated) MMU architecture you get smoother integration with other subsystems of the kernel tree, because some subsystems/features are written with "#ifdef CONFIG_MMU". NOMMU, on the other hand, brings a simplified design with better portability.

LKL rather takes the side of better portability.

> * Why pthreads and all? You already require jump_buf, so UML's switch_threads() ought to be just fine for scheduling?
> It almost seems like you're doing this just so you can serialize against "other threads" (application threads), but wouldn't that trivially be handled by the application? You could let it hook into switch_to() or something, but why should a single "LKL" CPU ever require multiple threads? Seems to me that the userspace could be required to "lkl_run()" or so (vs. lkl_start()). Heck, you could even exit lkl_run() every time you switch tasks in the kernel, and leave scheduling the kernel vs. the application entirely up to the application? (A trivial application would be simply doing something like "while (1) { lkl_run(); pause(); }" mimicking the idle loop of UML.

There is a description of this design choice in the LKL paper (*1):

  "implementations based on setjmp - longjmp require usage of a single
   stack space partitioned between all threads. As the Linux kernel
   uses deep stacks (especially in the VFS layer), in an environment
   with small stack sizes (e.g. inside another operating system's
   kernel) this will place a very low limit on the number of possible
   threads."

(from page 2, Section II, 2) Thread Support)

This is the reason for using pthread as the context primitive.

And instead of manually doing lkl_run() to schedule threads and relying on the host scheduler, LKL associates each kernel thread with a host-provided semaphore so that the Linux scheduler keeps control over the host-side scheduling (provided by pthreads). This is also described (and hasn't changed since then) in the paper *1 (from page 2, Section II, 3) Thread Switching).

> And - kind of the theme behind all these questions - why is this not making UML actually be a binary that uses LKL? If the design were like what I'm alluding to above, that should actually be possible? Why should it not be possible? Why would it not be desirable? (I'm actually thinking that might be really useful to some of the things I'm doing.)
> Yes, if the application actually supports userspace running then it has some limitations on what it can do (in particular wrt. signals etc.), but that could be documented and would be OK?

Let me try to describe why I think we shouldn't just generate liblinux.so from the current UML.

Making UML build a library, which has been a long-wanted feature, can be started. I think there are several functions which the library offers:

- applications can link the library and call functions in the library
- the library can be used as a replacement for libc.a for syscall operations

To design that with UML, what we need to do is:

1) change the Makefile to output liblinux.a
   We faced a linker-script issue, related to generating a relocatable object in the middle.

2) make the linker script clean with a 2-stage build
   This fixes the linker issues of (1).

3) expose syscalls as function calls
   This conflicts with existing names (link-time and compile-time conflicts).

4) header rename, object localization
   This fixes issue (3).

This is the common set of modifications to make a library out of UML. The other parts are a choice of design, I believe. Because a library is, by its nature, more _reusable_ than an executable, the choice of LKL is to be portable, which the current UML doesn't pursue extensively (it focuses on Intel platforms). Thus:

5) memory: NOMMU
6) scheduling (of irq/thread): pthread-based rather than setjmp/longjmp

Implementing the alternate options for 5) and 6) (MMU, jmp_buf) diminishes the strength of LKL, which we would like to avoid. But as you mentioned, nothing prevents us from implementing the alternate options 5) and 6), so we can share the common part (1-4) once we start to implement them.

I hope this makes it a bit clearer, but let me know if you find anything unclear.

-- Hajime
Hi,

> First of all, thanks for all the comments to the patchset which has been a bit stale. I'll reply them.

Yeah, sorry. I had it marked unread ("to look at") since you posted it.

> We didn't write down the details, which are already described in the LKL's paper (*1). But I think we can extract/summarize some of important information from the paper to the document so that the design is more understandable.
>
> *1 LKL's paper (pointer is also in the cover letter)
> https://www.researchgate.net/profile/Nicolae_Tapus2/publication/224164682_LKL_The_Linux_kernel_library/links/02bfe50fd921ab4f7c000000.pdf

OK, I guess I should take a look. Probably I never did, always thinking that it was more of an overview than technical details and design decisions.

> My interpretation of MMU/NOMMU is like this;
>
> With (emulated) MMU architecture you will have more smooth integration with other subsystems of kernel tree, because some subsystems/features are written with "#ifdef CONFIG_MMU". While NOMMU doesn't, it will bring a simplified design with better portability.
>
> LKL takes rather to benefit better portability.

I don't think it *matters* so much for portability? I mean, every system under the sun is going to allow some kind of "mprotect", right? You don't really want to port LKL to systems that don't have even that?

> > * Why pthreads and all? You already require jump_buf, so UML's switch_threads() ought to be just fine for scheduling? It almost seems like you're doing this just so you can serialize against "other threads" (application threads), but wouldn't that trivially be handled by the application? You could let it hook into switch_to() or something, but why should a single "LKL" CPU ever require multiple threads? Seems to me that the userspace could be required to "lkl_run()" or so (vs. lkl_start()). Heck, you could even exit lkl_run() every time you switch tasks in the kernel, and leave scheduling the kernel vs.
> > the application entirely up to the application? (A trivial application would be simply doing something like "while (1) { lkl_run(); pause(); }" mimicking the idle loop of UML.
>
> There is a description about this design choice in the LKL paper (*1);
>
> "implementations based on setjmp - longjmp require usage of a single stack space partitioned between all threads. As the Linux kernel uses deep stacks (especially in the VFS layer), in an environment with small stack sizes (e.g. inside another operating system's kernel) this will place a very low limit on the number of possible threads."
>
> (from page 2, Section II, 2) Thread Support)
>
> This is a reason of using pthread as a context primitive.

That implication (setjmp doesn't do stacks, so we must use pthreads) really isn't true; you also have posix contexts or windows fibers. That would probably be much easier to understand, since real threads imply that you have actual concurrency, which _shouldn't_ be true in the case of Linux emulated as being on a single CPU.

Perhaps that just means you chose the wrong abstraction.

In usfstl (something I've been working on) for example, we have an abstraction called (execution) "contexts", and they can be implemented using pthreads, fibers, or posix contexts, and you switch between them.

(see https://github.com/linux-test-project/usfstl/blob/main/src/ctx-common.c)

Using real pthreads implies that you have real threading, but then you need access to real mutexes, etc.

If your abstraction was instead "switch context" then you could still implement it using pthreads+mutexes, or you could implement it using fibers on windows, or posix contexts - but you'd have a significantly reduced API surface, since you'd only expose __switch_to() or similar, and maybe new stack allocation etc.

Additionally, I do wonder how UML does this now; it *does* use setjmp, so are you saying it doesn't properly use the kernel stacks?
> And instead of manually doing lkl_run() to schedule threads and relying on host scheduler, LKL associates each kernel thread with a host-provided semaphore so that Linux scheduler has a control of host scheduler (prepared by pthread).

Right. That's in line with what I did in my test framework in https://github.com/linux-test-project/usfstl/blob/main/src/ctx-pthread.c but like I said above, I think it's the wrong abstraction. Your abstraction should be "switch context" (or "switch thread"), not dealing with pthread, mutexes, etc.

> > And - kind of the theme behind all these questions - why is this not making UML actually be a binary that uses LKL? If the design were like what I'm alluding to above, that should actually be possible? Why should it not be possible? Why would it not be desirable? (I'm actually thinking that might be really useful to some of the things I'm doing.)
> > Yes, if the application actually supports userspace running then it has som limitations on what it can do (in particular wrt. signals etc.), but that could be documented and would be OK?
>
> Let me try to describe how I think why not just generate liblinux.so from current UML.
>
> Making UML to build a library, which has been a long wanted features, can be started;
>
> I think there are several functions which the library offers;
>
> - applications can link the library and call functions in the library

Right.

> - the library will be used as a replacement of libc.a for syscall operations

Not sure I see this, is that really useful? I mean, most applications don't live "standalone" in their own world? Dunno. Maybe it's useful.

> to design that with UML, what we need to do are;
>
> 1) change Makefile to output liblinux.a

or liblinux.so, I guess, dynamic linking should be ok.

> we faced linker script issue, which is related with generating relocatable object in the middle.
> 2) make the linker-script clean with 2-stage build
>    we fix the linker issues of (1)
>
> 3) expose syscall as a function call
>    conflicts names (link-time and compile-time conflicts)
>
> 4) header rename, object localization
>    to fix the issue (3)
>
> This is a common set of modifications to a library of UML.

All of this is just _build_ issues. It doesn't mean you couldn't take some minimal code + liblinux.a and link it to get a "linux" equivalent to the current UML?

TBH, I started thinking that it might be _really_ nice to be able to write an application that's *not quite UML* but has all the properties of UML built into it, i.e. can run userspace etc.

> Other parts are a choice of design, I believe.
> Because a library is more _reusable_ than an executable (by it means), the choice of LKL is to be portable, which the current UML doesn't pursue it extensibly (focus on intel platforms).

I don't think this really conflicts.

You could have a liblinux.a/liblinux.so and some code that links it all together to get "linux" (UML). Having userspace running inside the UML (liblinux) might only be supported on x86 for now, MMU vs. NOMMU might be something that's configurable at build time, and if you pick NOMMU you cannot run userspace either, etc.

But conceptually, why wouldn't it be possible to have a liblinux.so that *does* build with MMU and userspace support, and UML is a wrapper around it?

> I hope this makes it a bit clear, but let me know if you found anything unclear.

See above, I guess :)

Thanks for all the discussion!

johannes
On Tue, Mar 16, 2021 at 11:29 PM Johannes Berg <johannes@sipsolutions.net> wrote:
>
> Hi,

Hi Johannes,

> > My interpretation of MMU/NOMMU is like this;
> >
> > With (emulated) MMU architecture you will have more smooth integration with other subsystems of kernel tree, because some subsystems/features are written with "#ifdef CONFIG_MMU". While NOMMU doesn't, it will bring a simplified design with better portability.
> >
> > LKL takes rather to benefit better portability.
>
> I don't think it *matters* so much for portability? I mean, every system under the sun is going to allow some kind of "mprotect", right? You don't really want to port LKL to systems that don't have even that?

One use case where this matters is non-OS environments such as bootloaders [1], running on bare-metal hardware, or kernel drivers [2, 3]. IMO it would be nice to keep these properties.

[1] https://www.freelists.org/post/linux-kernel-library/UEFI-LKL-port
[2] https://github.com/lkl/lkl-win-fsd
[3] https://www.haiku-os.org/tags/lkl-haiku-fsd/

> > > * Why pthreads and all? You already require jump_buf, so UML's switch_threads() ought to be just fine for scheduling? It almost seems like you're doing this just so you can serialize against "other threads" (application threads), but wouldn't that trivially be handled by the application? You could let it hook into switch_to() or something, but why should a single "LKL" CPU ever require multiple threads? Seems to me that the userspace could be required to "lkl_run()" or so (vs. lkl_start()). Heck, you could even exit lkl_run() every time you switch tasks in the kernel, and leave scheduling the kernel vs. the application entirely up to the application? (A trivial application would be simply doing something like "while (1) { lkl_run(); pause(); }" mimicking the idle loop of UML.
> > There is a description about this design choice in the LKL paper (*1);
> >
> > "implementations based on setjmp - longjmp require usage of a single stack space partitioned between all threads. As the Linux kernel uses deep stacks (especially in the VFS layer), in an environment with small stack sizes (e.g. inside another operating system's kernel) this will place a very low limit on the number of possible threads."
> >
> > (from page 2, Section II, 2) Thread Support)
> >
> > This is a reason of using pthread as a context primitive.
>
> That impliciation (setjmp doesnt do stacks, so must use pthread) really isn't true, you also have posix contexts or windows fibers. That would probably be much easier to understands, since real threads imply that you have actual concurrency, which _shouldn't_ be true in the case of Linux emulated as being on a single CPU.
>
> Perhaps that just means you chose the wrong abstraction.
>
> In usfstl (something I've been working on) for example, we have an abstraction called (execution) "contexts", and they can be implemented using pthreads, fibers, or posix contexts, and you switch between them.
>
> (see https://github.com/linux-test-project/usfstl/blob/main/src/ctx-common.c)
>
> Using real pthreads implies that you have real threading, but then you need access to real mutexes, etc.
>
> If your abstraction was instead "switch context" then you could still implement it using pthreads+mutexes, or you could implement it using fibers on windows, or posix contexts - but you'd have a significantly reduced API surface, since you'd only expose __switch_to() or similar, and maybe a new stack allocation etc.

You are right. When I started the implementation for ucontext it was obvious that it would be much simpler to have abstractions closer to what Linux has (alloc, free and switch threads). But I never got to finish that and then things went into a different direction.
> Additionally, I do wonder how UML does this now, it *does* use setjmp, so are you saying it doesn't properly use the kernel stacks?

To clarify the statement in the paper a bit: the context there was that we should push the thread implementation to the application/environment we run in, rather than providing "LKL" threads. This was particularly important for running LKL in other OSes' kernel drivers. But you are right, we can use the switch abstraction and implement it with threads and mutexes for those environments where it helps.

> > to design that with UML, what we need to do are;
> >
> > 1) change Makefile to output liblinux.a
>
> or liblinux.so, I guess, dynamic linking should be ok.
>
> > we faced linker script issue, which is related with generating relocatable object in the middle.
> >
> > 2) make the linker-script clean with 2-stage build
> > we fix the linker issues of (1)
> >
> > 3) expose syscall as a function call
> > conflicts names (link-time and compile-time conflicts)
> >
> > 4) header rename, object localization
> > to fix the issue (3)
> >
> > This is a common set of modifications to a library of UML.
>
> All of this is just _build_ issues. It doesn't mean you couldn't take some minimal code + liblinux.a and link it to get a "linux" equivalent to the current UML?
>
> TBH, I started thinking that it might be _really_ nice to be able to write an application that's *not quite UML* but has all the properties of UML built into it, i.e. can run userspace etc.
>
> > Other parts are a choice of design, I believe.
> > Because a library is more _reusable_ than an executable (by it means), the choice of LKL is to be portable, which the current UML doesn't pursue it extensibly (focus on intel platforms).
>
> I don't think this really conflicts.
>
> You could have a liblinux.a/liblinux.so and some code that links it all together to get "linux" (UML).
> Having userspace running inside the UML (liblinux) might only be supported on x86 for now, MMU vs. NOMMU might be something that's configurable at build time, and if you pick NOMMU you cannot run userspace either, etc.
>
> But conceptually, why wouldn't it be possible to have a liblinux.so that *does* build with MMU and userspace support, and UML is a wrapper around it?

This is an interesting idea. Conceptually I think it is possible. There are lots of details to be figured out before we do this. I think that having a NOMMU version could be a good step in the right direction, especially since I think a liblinux.so has more NOMMU use cases than MMU use cases - but I haven't given too much thought to the MMU use cases.
Hi,

> One use case where this matters are non OS environments such as bootloaders [1], running on bare-bone hardware or kernel drivers [2, 3]. IMO it would be nice to keep these properties.

OK, that makes sense. Still, it seems it could be a compile-time decision, and doesn't necessarily mean LKL has to be NOMMU, just that it could support both?

I'm really trying to see if we can't get UML to be a user of LKL. IMHO that would be good for the code, and even be good for LKL since then it's maintained as part of UML as well, not "just" as its own use case.

> > If your abstraction was instead "switch context" then you could still implement it using pthreads+mutexes, or you could implement it using fibers on windows, or posix contexts - but you'd have a significantly reduced API surface, since you'd only expose __switch_to() or similar, and maybe a new stack allocation etc.
>
> You are right. When I started the implementation for ucontext it was obvious that it would be much simpler to have abstractions closer to what Linux has (alloc, free and switch threads). But I never got to finish that and then things went into a different direction.

OK, sounds like you came to the same conclusion, more or less.

> > Additionally, I do wonder how UML does this now, it *does* use setjmp, so are you saying it doesn't properly use the kernel stacks?
>
> To clarify a bit the statement in the paper, the context there was that we should push the thread implementation to the application/environment we run rather than providing "LKL" threads. This was particularly important for running LKL in other OSes kernel drivers. But you are right, we can use the switch abstraction and implement it with threads and mutexes for those environments where it helps.

Right - like I pointed to the USFSTL framework, you could have posix ucontext, fiber and pthread at least, and obviously other things in other environments (ThreadX anyone?
;-) )

> > But conceptually, why wouldn't it be possible to have a liblinux.so that *does* build with MMU and userspace support, and UML is a wrapper around it?
>
> This is an interesting idea. Conceptually I think it is possible. There are lots of details to be figured out before we do this. I think that having a NOMMU version could be a good step in the right direction, especially since I think a liblinux.so has more NOMMU usecases than MMU usecases - but I haven't given too much thought to the MMU usecases.

Yeah, maybe UML would be the primary use case. I have been thinking that there would be cases where you could combine kunit and having userspace though, or unit-style testing, not with kunit which is "inside" the kernel, but instead having the test code more "outside" the test kernel. That's all kind of handwaving though and not really that crystallized in my mind.

That said, I'm not entirely sure NOMMU would be the right path towards this - if we do want to go this route it'll probably need changes in both LKL and UML to converge to this point, and at least build it into the abstractions.

For example the "idle" abstraction discussed elsewhere (is it part of the app or part of the kernel?), or the thread discussion above (it is part of the app but how is it implemented?) etc.

johannes
Hello,

On Wed, 17 Mar 2021 23:24:14 +0900, Johannes Berg wrote:
>
> Hi,
>
> > One use case where this matters are non OS environments such as bootloaders [1], running on bare-bone hardware or kernel drivers [2, 3]. IMO it would be nice to keep these properties.
>
> OK, that makes sense. Still, it seems it could be a compile-time decision, and doesn't necessarily mean LKL has to be NOMMU, just that it could support both?
>
> I'm really trying to see if we can't get UML to be a user of LKL. IMHO that would be good for the code, and even be good for LKL since then it's maintained as part of UML as well, not "just" as its own use case.
>
> > > If your abstraction was instead "switch context" then you could still implement it using pthreads+mutexes, or you could implement it using fibers on windows, or posix contexts - but you'd have a significantly reduced API surface, since you'd only expose __switch_to() or similar, and maybe a new stack allocation etc.
> >
> > You are right. When I started the implementation for ucontext it was obvious that it would be much simpler to have abstractions closer to what Linux has (alloc, free and switch threads). But I never got to finish that and then things went into a different direction.
>
> OK, sounds like you came to the same conclusion, more or less.
>
> > > Additionally, I do wonder how UML does this now, it *does* use setjmp, so are you saying it doesn't properly use the kernel stacks?
> >
> > To clarify a bit the statement in the paper, the context there was that we should push the thread implementation to the application/environment we run rather than providing "LKL" threads. This was particularly important for running LKL in other OSes kernel drivers. But you are right, we can use the switch abstraction and implement it with threads and mutexes for those environments where it helps.
> Right - like I pointed to USFSTL framework, you could have posix ucontext, fiber and pthread at least, and obviously other things in other environments (ThreadX anyone? ;-) )

I also have an idea for ThreadX in the future, which also implements the actual context on the application/environment/host side (not in the kernel side, as others do). Though this environment may not provide mprotect-like features, there is still value in the application being able to run Linux code (e.g., the network stack) for instance.

# This story is about our old work on network simulation.
https://lwn.net/Articles/639333/

> > > But conceptually, why wouldn't it be possible to have a liblinux.so that *does* build with MMU and userspace support, and UML is a wrapper around it?
> >
> > This is an interesting idea. Conceptually I think it is possible. There are lots of details to be figured out before we do this. I think that having a NOMMU version could be a good step in the right direction, especially since I think a liblinux.so has more NOMMU usecases than MMU usecases - but I haven't given too much thought to the MMU usecases.
>
> Yeah, maybe UML would be the primary use case. I have been thinking that there would be cases where you could combine kunit and having userspace though, or unit-style testing but not with kunit which is "inside" the kernel, but instead having the test code more "outside" the test kernel. That's all kind of handwaving though and not really that crystallized in my mind.
>
> That said, I'm not entirely sure NOMMU would be the right path towards this - if we do want to go this route it'll probably need changes in both LKL and UML to converge to this point, and at least build it into the abstractions.
>
> For example the "idle" abstraction discussed elsewhere (is it part of the app or part of the kernel?), or the thread discussion above (it is part of the app but how is it implemented?) etc.
I agree that LKL (or the library mode) can conceptually offer both NOMMU/MMU capabilities.

I also think that a NOMMU library could be the first step and a minimum product, as an MMU implementation may involve a lot of refactoring which needs more consideration of the current codebase.

We tried an MMU-mode library, by sharing the build system (Kconfig/Makefile) and runtime facilities (thread/irq/memory). But we could only share irq handling for this first step.

When we implement the MMU-mode library in the future, we may come up with another abstraction/refactoring of the UML design, which could be a good outcome. But I think it is beyond the minimum, given the (already) big changes in the current patchset.

-- Hajime
Hi,

> I also have an idea for a ThreadX in future, which also implements actual context in the application/environment/host side (not in kernel side, as others do). Though this environment may not provide mprotect-like features, there is still a value that the application can run Linux code (e.g., network stack) for instance.

Heh. Right.

> I agree that LKL (or the library mode) can conceptually offer both NOMMU/MMU capabilities.
>
> I also think that NOMMU library could be the first step and a minimum product as MMU implementation may involve a lot of refactoring which may need more consideration to the current codebase.
>
> We tried with MMU mode library, by sharing build system (Kconfig/Makefile) and runtime facilities (thread/irq/memory). But, we could only do share irq handling for this first step.
>
> When we implement the MMU mode library in future, we may come up with another abstraction/refactoring into the UML design, which could be a good outcome. But I think it is beyond the minimum given (already) big changes with the current patchset.

Well, arguably that depends on how you look at it.

Understandably, you're looking at this from the POV of getting an "MVP" (minimum viable product) into mainline as soon as possible. I can understand why you would do that, and this patchset achieves it: you get an LKL in mainline that's useful, even if it doesn't achieve the best possible architecture and code sharing.

But look at it from the opposite side, from mainline's view (at least in my opinion, others may disagree): getting an LKL (whether as an MVP or not) isn't really that important! Getting the architecture and code sharing right are likely the *primary* goals for mainline in this integration.

So from my POV it's *more important* to get the shared facilities, proper abstraction and refactoring right, likely to the point where UML is actually a "small binary using the library" (in some fashion).
Even if that initially means there actually *won't* be a NOMMU mode and a library that's useful for the LKL use cases. Yes, that's the longer road into mainline, but it also means that each step along the way is actually useful to mainline. I'm assuming here that the necessary code refactoring, abstraction, etc. will by itself provide some value to UML, but given the messy state it's in, I think that's almost certainly going to be true.

So in a sense "getting LKL into UML" is at odds with "getting LKL working quickly". However, doing it this way may ultimately get it into mainline faster because it's a much easier incremental route.

Say you want to get all this thread stuff out of the way that we discussed - if you need to keep UML working while *using* the abstraction you're adding (in order to work towards the goal of it using the library), then it becomes fairly obvious that you cannot use the abstraction that you have now, with pthreads, mutexes, and semaphores exposed via APIs, but need to build the API on "thread switching" primitives instead. I would expect similar things to be true for other places.

Now, are you/we up for that? I don't know. On the one hand, I know you're persistent and interested in this, but on the other hand it's somewhat at odds with your goals. I believe for mainline it'd be better because the code is no worse off at each step along the way.

Taking the thread example again, if we have a thread-switching abstraction and an implementation in UML, the worst case (e.g. if you lose interest) is that it's a somewhat pointless abstraction there, but it doesn't really make the code significantly worse or more complex. OTOH, having what we have now with pthreads/mutexes/semaphores *does* make the code significantly more complex and harder to maintain (IMHO) because it adds all kinds of special cases, and they're somewhat more difficult to exercise (yes, there are examples, still).
In any case, I don't think I'm the one making the decisions here, so take this with a grain of salt.

johannes