Message ID: 1320543320-32728-1-git-send-email-agraf@suse.de
State: New
On 06.11.2011 02:35, Alexander Graf wrote:
> On LinuxCon I had a nice chat with Linus on what he thinks kvm-tool
> would be doing and what he expects from it. Basically he wants a
> small and simple tool he and other developers can run to try out and
> see if the kernel they just built actually works.
>
> Fortunately, QEMU can do that today already! The only piece that was
> missing was the "simple" piece of the equation, so here is a script
> that wraps around QEMU and executes a kernel you just built.
>
> If you do have KVM around and are not cross-compiling, it will use
> KVM. But if you don't, you can still fall back to emulation mode and
> at least check if your kernel still does what you expect. I only
> implemented support for s390x and ppc there, but it's easily extensible
> to more platforms, as QEMU can emulate (and virtualize) pretty much
> any platform out there.
>
> If you don't have qemu installed, please do so before using this script. Your
> distro should provide a package for it (might even call it "kvm"). If not,
> just compile it from source - it's not hard!
>
> To quickly get going, just execute the following as user:
>
> $ ./Documentation/run-qemu.sh -r / -a init=/bin/bash

Path needs updating.

> This will drop you into a shell on your rootfs.
>
> Happy hacking!
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
>
> ---
> diff --git a/tools/testing/run-qemu/run-qemu.sh b/tools/testing/run-qemu/run-qemu.sh
> new file mode 100755
> index 0000000..70f194f
> --- /dev/null
> +++ b/tools/testing/run-qemu/run-qemu.sh
> +# Try to find the KVM accelerated QEMU binary
> +
> +[ "$ARCH" ] || ARCH=$(uname -m)
> +case $ARCH in
> +x86_64)
> +    KERNEL_BIN=arch/x86/boot/bzImage
> +    # SUSE and Red Hat call the binary qemu-kvm
> +    [ "$QEMU_BIN" ] || QEMU_BIN=$(which qemu-kvm 2>/dev/null)
> +
> +    # Debian and Gentoo call it kvm
> +    [ "$QEMU_BIN" ] || QEMU_BIN=$(which kvm 2>/dev/null)
> +
> +    # QEMU's own build system calls it qemu-system-x86_64
> +    [ "$QEMU_BIN" ] || QEMU_BIN=$(which qemu-system-x86_64 2>/dev/null)
> +    ;;
> +i*86)
> +    KERNEL_BIN=arch/x86/boot/bzImage
> +    # SUSE and Red Hat call the binary qemu-kvm
> +    [ "$QEMU_BIN" ] || QEMU_BIN=$(which qemu-kvm 2>/dev/null)
> +
> +    # Debian and Gentoo call it kvm
> +    [ "$QEMU_BIN" ] || QEMU_BIN=$(which kvm 2>/dev/null)
> +
> +    KERNEL_BIN=arch/x86/boot/bzImage

Copy&paste?

> +    # i386 version of QEMU

QEMU's own build system calls it qemu-system-i386 now. :)

> +    [ "$QEMU_BIN" ] || QEMU_BIN=$(which qemu 2>/dev/null)

We should first test for qemu-system-i386, then fall back to old qemu.

Andreas

P.S. You're still ahead of time...
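Andreas's suggested ordering for the i*86 branch (probe qemu-system-i386 before falling back to the legacy qemu binary) could be sketched roughly as follows. The pick_first_binary helper is a hypothetical refactoring for illustration; the posted patch open-codes each `which` call instead:

```shell
# Return the first binary from the argument list found in $PATH.
# (Hypothetical helper, not part of the posted patch.)
pick_first_binary() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

KERNEL_BIN=arch/x86/boot/bzImage
# i*86: prefer the modern name, then the distro aliases, then legacy "qemu"
[ "$QEMU_BIN" ] || QEMU_BIN=$(pick_first_binary qemu-system-i386 qemu-kvm kvm qemu)
```

Collapsing the fallback chain into one line per architecture would also remove the duplicated KERNEL_BIN assignment Andreas flagged.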
Hi Alexander, On Sun, Nov 6, 2011 at 3:35 AM, Alexander Graf <agraf@suse.de> wrote: > On LinuxCon I had a nice chat with Linus on what he thinks kvm-tool > would be doing and what he expects from it. Basically he wants a > small and simple tool he and other developers can run to try out and > see if the kernel they just built actually works. > > Fortunately, QEMU can do that today already! The only piece that was > missing was the "simple" piece of the equation, so here is a script > that wraps around QEMU and executes a kernel you just built. I'm happy to see some real competition for the KVM tool in usability. ;-) That said, while the script looks really useful for developers, wouldn't it make more sense to put it in QEMU to make sure it's kept up-to-date and distributions can pick it up too? (And yes, I realize the irony here.) Pekka
On 11/06/2011 12:04 PM, Pekka Enberg wrote: > Hi Alexander, > > On Sun, Nov 6, 2011 at 3:35 AM, Alexander Graf <agraf@suse.de> wrote: > > On LinuxCon I had a nice chat with Linus on what he thinks kvm-tool > > would be doing and what he expects from it. Basically he wants a > > small and simple tool he and other developers can run to try out and > > see if the kernel they just built actually works. > > > > Fortunately, QEMU can do that today already! The only piece that was > > missing was the "simple" piece of the equation, so here is a script > > that wraps around QEMU and executes a kernel you just built. > > I'm happy to see some real competition for the KVM tool in usability. ;-) > > That said, while the script looks really useful for developers, > wouldn't it make more sense to put it in QEMU to make sure it's kept > up-to-date and distributions can pick it up too? (And yes, I realize > the irony here.) Why would distributions want it? It's only useful for kernel developers.
On Sun, Nov 6, 2011 at 12:07 PM, Avi Kivity <avi@redhat.com> wrote: >> I'm happy to see some real competition for the KVM tool in usability. ;-) >> >> That said, while the script looks really useful for developers, >> wouldn't it make more sense to put it in QEMU to make sure it's kept >> up-to-date and distributions can pick it up too? (And yes, I realize >> the irony here.) > > Why would distributions want it? It's only useful for kernel developers. It's useful for kernel testers too. If this is a serious attempt in making QEMU command line suck less on Linux, I think it makes sense to do this properly instead of adding a niche script to the kernel tree that's simply going to bit rot over time. Pekka
On 11/06/2011 12:12 PM, Pekka Enberg wrote: > On Sun, Nov 6, 2011 at 12:07 PM, Avi Kivity <avi@redhat.com> wrote: > >> I'm happy to see some real competition for the KVM tool in usability. ;-) > >> > >> That said, while the script looks really useful for developers, > >> wouldn't it make more sense to put it in QEMU to make sure it's kept > >> up-to-date and distributions can pick it up too? (And yes, I realize > >> the irony here.) > > > > Why would distributions want it? It's only useful for kernel developers. > > It's useful for kernel testers too. Well, they usually have a kernel with them. > If this is a serious attempt in making QEMU command line suck less on > Linux, I think it makes sense to do this properly instead of adding a > niche script to the kernel tree that's simply going to bit rot over > time. You misunderstand. This is an attempt to address the requirements of a niche population, kernel developers and testers, not to improve the qemu command line. For the majority of qemu installations, this script is useless. In most installations, qemu is driven by other programs, so any changes to the command line would be invisible, except insofar as they break things. For the occasional direct user of qemu, something like 'qemu-kvm -m 1G /images/blah.img' is enough to boot an image. This script doesn't help in any way. This script is for kernel developers who don't want to bother with setting up a disk image (which, btw, many are still required to do - I'm guessing most kernel developers who use qemu are cross-arch). It has limited scope and works mostly by hiding qemu features. As such it doesn't belong in qemu.
Hi Avi, On Sun, 2011-11-06 at 12:23 +0200, Avi Kivity wrote: > > If this is a serious attempt in making QEMU command line suck less on > > Linux, I think it makes sense to do this properly instead of adding a > > niche script to the kernel tree that's simply going to bit rot over > > time. > > You misunderstand. This is an attempt to address the requirements of a > niche population, kernel developers and testers, not to improve the qemu > command line. For the majority of qemu installations, this script is > useless. Right. On Sun, 2011-11-06 at 12:23 +0200, Avi Kivity wrote: > In most installations, qemu is driven by other programs, so any changes > to the command line would be invisible, except insofar as they break things. > > For the occasional direct user of qemu, something like 'qemu-kvm -m 1G > /images/blah.img' is enough to boot an image. This script doesn't help > in any way. > > This script is for kernel developers who don't want to bother with > setting up a disk image (which, btw, many are still required to do - I'm > guessing most kernel developers who use qemu are cross-arch). It has > limited scope and works mostly by hiding qemu features. As such it > doesn't belong in qemu. I'm certainly not against merging the script if people are actually using it and it solves their problem. I personally find the whole exercise pointless because it's not attempting to solve any of the fundamental issues QEMU command line interface has nor does it try to make Linux on Linux virtualization simpler and more integrated. People seem to think the KVM tool is only about solving a specific problem to kernel developers. That's certainly never been my goal as I do lots of userspace programming as well. The end game for me is to replace QEMU/VirtualBox for Linux on Linux virtualization for my day to day purposes. Pekka
On 11/06/2011 01:08 PM, Pekka Enberg wrote: > On Sun, 2011-11-06 at 12:23 +0200, Avi Kivity wrote: > > In most installations, qemu is driven by other programs, so any changes > > to the command line would be invisible, except insofar as they break things. > > > > For the occasional direct user of qemu, something like 'qemu-kvm -m 1G > > /images/blah.img' is enough to boot an image. This script doesn't help > > in any way. > > > > This script is for kernel developers who don't want to bother with > > setting up a disk image (which, btw, many are still required to do - I'm > > guessing most kernel developers who use qemu are cross-arch). It has > > limited scope and works mostly by hiding qemu features. As such it > > doesn't belong in qemu. > > I'm certainly not against merging the script if people are actually > using it and it solves their problem. > > I personally find the whole exercise pointless because it's not > attempting to solve any of the fundamental issues QEMU command line > interface There are no "fundamental qemu command line issues". It's hairy, yes, and verbose, but using "fundamental" to describe a choice between one arcane set of command line options and another is a bit of an overstatement. Most users will use a GUI anyway. > has nor does it try to make Linux on Linux virtualization > simpler and more integrated. So far, kvm-tool capabilities are a subset of qemu's. Does it add anything beyond a different command-line? > People seem to think the KVM tool is only about solving a specific > problem to kernel developers. That's certainly never been my goal as I > do lots of userspace programming as well. The end game for me is to > replace QEMU/VirtualBox for Linux on Linux virtualization for my day to > day purposes. Maybe it should be in tools/pekka then. Usually subsystems that want to be merged into Linux have broader audiences though.
On Sun, Nov 6, 2011 at 1:50 PM, Avi Kivity <avi@redhat.com> wrote: >> People seem to think the KVM tool is only about solving a specific >> problem to kernel developers. That's certainly never been my goal as I >> do lots of userspace programming as well. The end game for me is to >> replace QEMU/VirtualBox for Linux on Linux virtualization for my day to >> day purposes. > > Maybe it should be in tools/pekka then. Usually subsystems that want to > be merged into Linux have broader audiences though. I think you completely missed my point. I'm simply saying that KVM tool was never about solving a narrow problem Alexander's script is trying to solve. That's why I feel it's such a pointless exercise. Pekka
On 11/06/2011 02:14 PM, Pekka Enberg wrote: > On Sun, Nov 6, 2011 at 1:50 PM, Avi Kivity <avi@redhat.com> wrote: > >> People seem to think the KVM tool is only about solving a specific > >> problem to kernel developers. That's certainly never been my goal as I > >> do lots of userspace programming as well. The end game for me is to > >> replace QEMU/VirtualBox for Linux on Linux virtualization for my day to > >> day purposes. > > > > Maybe it should be in tools/pekka then. Usually subsystems that want to > > be merged into Linux have broader audiences though. > > I think you completely missed my point. > > I'm simply saying that KVM tool was never about solving a narrow > problem Alexander's script is trying to solve. That's why I feel it's > such a pointless exercise. But from your description, you're trying to solve just another narrow problem: "The end game for me is to replace QEMU/VirtualBox for Linux on Linux virtualization for my day to day purposes. " We rarely merge a subsystem to solve one person's problem (esp. when it is defined as "replace another freely available project", even if you dislike its command line syntax).
On Sun, Nov 6, 2011 at 1:50 PM, Avi Kivity <avi@redhat.com> wrote: > So far, kvm-tool capabilities are a subset of qemu's. Does it add > anything beyond a different command-line? I think "different command line" is a big thing which is why we've spent so much time on it. But if you mean other end user features, no, we don't bring anything new to the table right now. I think our userspace networking implementation is better than QEMU's slirp but that's a purely technical thing. I also don't think we should add new features for their own sake. Linux virtualization isn't a terribly difficult thing to do thanks to KVM and virtio drivers. I think most of the big ticket items will be doing things like improving guest isolation and making guests more accessible to the host. Pekka
On Sun, Nov 6, 2011 at 2:27 PM, Avi Kivity <avi@redhat.com> wrote: > But from your description, you're trying to solve just another narrow > problem: > > "The end game for me is to replace QEMU/VirtualBox for Linux on Linux > virtualization for my day to day purposes. " > > We rarely merge a subsystem to solve one person's problem (esp. when it > is defined as "replace another freely available project", even if you > dislike its command line syntax). I really don't understand your point. Other people are using the KVM tool for other purposes. For example, the (crazy) simulation guys are using the tool to launch even more guests on a single host and Ingo seems to be using the tool to test kernels. I'm not suggesting we should merge the tool because of my particular use case. I'm simply saying the problem I personally want to solve with the KVM tool is broader than what Alexander's script is doing. That's why I feel it's a pointless project. Pekka
On 11/06/2011 02:32 PM, Pekka Enberg wrote: > On Sun, Nov 6, 2011 at 2:27 PM, Avi Kivity <avi@redhat.com> wrote: > > But from your description, you're trying to solve just another narrow > > problem: > > > > "The end game for me is to replace QEMU/VirtualBox for Linux on Linux > > virtualization for my day to day purposes. " > > > > We rarely merge a subsystem to solve one person's problem (esp. when it > > is defined as "replace another freely available project", even if you > > dislike its command line syntax). > > I really don't understand your point. Other people are using the KVM > tool for other purposes. For example, the (crazy) simulation guys are > using the tool to launch even more guests on a single host and Ingo > seems to be using the tool to test kernels. > > I'm not suggesting we should merge the tool because of my particular > use case. I'm simply saying the problem I personally want to solve > with the KVM tool is broader than what Alexander's script is doing. > That's why I feel it's a pointless project. We're going in circles, but I'll try again. You say that kvm-tool's scope is broader than Alex's script, therefore the latter is pointless. You accept that qemu's scope is broader than kvm-tool (and is a superset). That is why many people think kvm-tool is pointless. Alex's script, though, is just a few dozen lines. kvm-tool is a 20K patch - in fact 2X as large as kvm when it was first merged. And its main feature seems to be that "it is not qemu".
On Sun, Nov 6, 2011 at 2:43 PM, Avi Kivity <avi@redhat.com> wrote: > You say that kvm-tool's scope is broader than Alex's script, therefore > the latter is pointless. I'm saying that Alex's script is pointless because it's not attempting to fix the real issues. For example, we're trying to make it as easy as possible to set up a guest and to be able to access guest data from the host. Alex's script is essentially just a simplified QEMU "front end" for kernel developers. That's why I feel it's a pointless thing to do. On Sun, Nov 6, 2011 at 2:43 PM, Avi Kivity <avi@redhat.com> wrote: > You accept that qemu's scope is broader than kvm-tool (and is a > superset). That is why many people think kvm-tool is pointless. Sure. I think it's mostly people that are interested in non-Linux virtualization that think the KVM tool is a pointless project. However, some people (including myself) think the KVM tool is a more usable and hackable tool than QEMU for Linux virtualization. The difference here is that although I feel Alex's script is a pointless project, I'm in no way opposed to merging it in the tree if people use it and it solves their problem. Some people seem to be violently opposed to merging the KVM tool and I'm having a difficult time understanding why that is. Pekka
On Sun, Nov 6, 2011 at 2:43 PM, Avi Kivity <avi@redhat.com> wrote: > Alex's script, though, is just a few dozen lines. kvm-tool is a 20K > patch - in fact 2X as large as kvm when it was first merged. And its > main feature seems to be that "it is not qemu". I think I've mentioned many times that I find the QEMU source terribly difficult to read and hack on. So if you mean "not qemu" from that point of view, sure, I think it's a very important point. The command line interface is also "not qemu" for a very good reason too. As for virtio drivers and such, we're actually following QEMU's example very closely. I guess we're going to diverge a bit for better guest isolation but fundamentally I don't see why we'd want to be totally different from QEMU on that level. Pekka
On 11/06/2011 03:06 PM, Pekka Enberg wrote: > On Sun, Nov 6, 2011 at 2:43 PM, Avi Kivity <avi@redhat.com> wrote: > > You say that kvm-tool's scope is broader than Alex's script, therefore > > the latter is pointless. > > I'm saying that Alex's script is pointless because it's not attempting > to fix the real issues. For example, we're trying to make it as > easy as possible to set up a guest and to be able to access guest data > from the host. Have you tried virt-install/virt-manager? > Alex's script is essentially just a simplified QEMU > "front end" for kernel developers. AFAIR it was based off a random Linus remark. > That's why I feel it's a pointless thing to do. > > On Sun, Nov 6, 2011 at 2:43 PM, Avi Kivity <avi@redhat.com> wrote: > > You accept that qemu's scope is broader than kvm-tool (and is a > > superset). That is why many people think kvm-tool is pointless. > > Sure. I think it's mostly people that are interested in non-Linux > virtualization that think the KVM tool is a pointless project. > However, some people (including myself) think the KVM tool is a more > usable and hackable tool than QEMU for Linux virtualization. More hackable, certainly, as any 20kloc project will be compared to a 700+kloc project with a long history. More usable, I really doubt this. You take it for granted that people want to run their /boot kernels in a guest, but in fact only kernel developers (and testers) want this. The majority want the real guest kernel. > The difference here is that although I feel Alex's script is a > pointless project, I'm in no way opposed to merging it in the tree if > people use it and it solves their problem. Some people seem to be > violently opposed to merging the KVM tool and I'm having a difficult > time understanding why that is. One of the reasons is that if it is merged, anyone with a #include <linux/foo.h> will line up for the next merge window, wanting in. 
The other is that anything in the Linux source tree might gain an unfair advantage over out-of-tree projects (at least that's how I read Jan's comment).
On 2011-11-06 14:06, Pekka Enberg wrote:
> Sure. I think it's mostly people that are interested in non-Linux
> virtualization that think the KVM tool is a pointless project.
> However, some people (including myself) think the KVM tool is a more
> usable and hackable tool than QEMU for Linux virtualization.

"Hackable" is relative. I'm surely not saying QEMU has nicer code than
kvm-tool, rather the contrary. But if it were that bad, we would not have
hundreds of contributors, just in the very recent history.

"Usable" - I've tried kvm-tool several times and still (today) fail to get a
standard SUSE image (with a kernel I have to compile and provide
separately...) up and running *). Likely a user mistake, but none that is
very obvious. At least to me.

In contrast, you can throw arbitrary Linux distros in various forms at QEMU,
and it will catch and run them. For me, already this is more usable.

Jan

*) kvm run -m 1000 -d OpenSuse11-4_64.img arch/x86/boot/bzImage \
       -p root=/dev/vda2
...
[    1.772791] mousedev: PS/2 mouse device common for all mice
[    1.774603] cpuidle: using governor ladder
[    1.775490] cpuidle: using governor menu
[    1.776865] input: AT Raw Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[    1.778609] TCP cubic registered
[    1.779456] Installing 9P2000 support
[    1.782390] Registering the dns_resolver key type
[    1.794323] registered taskstats version 1

...and here the boot just stops, guest apparently waits for something
Hi Jan, On Sun, Nov 6, 2011 at 6:19 PM, Jan Kiszka <jan.kiszka@web.de> wrote: > "Usable" - I've tried kvm-tool several times and still (today) fail to > get a standard SUSE image (with a kernel I have to compile and provide > separately...) up and running *). Likely a user mistake, but none that > is very obvious. At least to me. > > In contrast, you can throw arbitrary Linux distros in various forms at > QEMU, and it will catch and run them. For me, already this is more usable. > > *) kvm run -m 1000 -d OpenSuse11-4_64.img arch/x86/boot/bzImage \ > -p root=/dev/vda2 > ... > [ 1.772791] mousedev: PS/2 mouse device common for all mice > [ 1.774603] cpuidle: using governor ladder > [ 1.775490] cpuidle: using governor menu > [ 1.776865] input: AT Raw Set 2 keyboard as > /devices/platform/i8042/serio0/input/input0 > [ 1.778609] TCP cubic registered > [ 1.779456] Installing 9P2000 support > [ 1.782390] Registering the dns_resolver key type > [ 1.794323] registered taskstats version 1 > > ...and here the boot just stops, guest apparently waits for something Can you please share your kernel .config with me and I'll take a look at it. We now have a "make kvmconfig" makefile target for enabling all the necessary config options for guest kernels. I don't think any of us developers are using SUSE so it can surely be a KVM tool bug as well. Pekka
Hi Avi, On Sun, Nov 6, 2011 at 5:56 PM, Avi Kivity <avi@redhat.com> wrote: > On 11/06/2011 03:06 PM, Pekka Enberg wrote: >> On Sun, Nov 6, 2011 at 2:43 PM, Avi Kivity <avi@redhat.com> wrote: >> > You say that kvm-tool's scope is broader than Alex's script, therefore >> > the latter is pointless. >> >> I'm saying that Alex's script is pointless because it's not attempting >> to fix the real issues. For example, we're trying to make it as >> easy as possible to set up a guest and to be able to access guest data >> from the host. > > Have you tried virt-install/virt-manager? No, I don't use virt-manager. I know a lot of people do which is why someone is working on KVM tool libvirt integration. >> On Sun, Nov 6, 2011 at 2:43 PM, Avi Kivity <avi@redhat.com> wrote: >> > You accept that qemu's scope is broader than kvm-tool (and is a >> > superset). That is why many people think kvm-tool is pointless. >> >> Sure. I think it's mostly people that are interested in non-Linux >> virtualization that think the KVM tool is a pointless project. >> However, some people (including myself) think the KVM tool is a more >> usable and hackable tool than QEMU for Linux virtualization. > > More hackable, certainly, as any 20kloc project will be compared to a > 700+kloc project with a long history. More usable, I really doubt > this. You take it for granted that people want to run their /boot > kernels in a guest, but in fact only kernel developers (and testers) > want this. The majority want the real guest kernel. Our inability to boot ISO images, for example, is a usability limitation, sure. I'm hoping to fix that at some point. >> The difference here is that although I feel Alex's script is a >> pointless project, I'm in no way opposed to merging it in the tree if >> people use it and it solves their problem. Some people seem to be >> violently opposed to merging the KVM tool and I'm having a difficult >> time understanding why that is. 
> > One of the reasons is that if it is merged, anyone with a #include > <linux/foo.h> will line up for the next merge window, wanting in. The > other is that anything in the Linux source tree might gain an unfair > advantage over out-of-tree projects (at least that's how I read Jan's > comment). Well, having gone through the process of getting something included so far, I'm not at all worried that there's going to be a huge queue of "#include <linux/foo.h>" projects if we get in... What kind of unfair advantage are you referring to? I've specifically said that the only way for KVM tool to become a reference implementation would be that the KVM maintainers take the tool through their tree. As that's not going to happen, I don't see what the problem would be. Pekka
On Sun, Nov 6, 2011 at 6:19 PM, Jan Kiszka <jan.kiszka@web.de> wrote: > In contrast, you can throw arbitrary Linux distros in various forms at > QEMU, and it will catch and run them. For me, already this is more usable. Yes, I completely agree that this is an unfortunate limitation in the KVM tool. We definitely need to support booting to images which have virtio drivers enabled. Pekka
On 2011-11-06 17:30, Pekka Enberg wrote: > Hi Jan, > > On Sun, Nov 6, 2011 at 6:19 PM, Jan Kiszka <jan.kiszka@web.de> wrote: >> "Usable" - I've tried kvm-tool several times and still (today) fail to >> get a standard SUSE image (with a kernel I have to compile and provide >> separately...) up and running *). Likely a user mistake, but none that >> is very obvious. At least to me. >> >> In contrast, you can throw arbitrary Linux distros in various forms at >> QEMU, and it will catch and run them. For me, already this is more usable. >> >> *) kvm run -m 1000 -d OpenSuse11-4_64.img arch/x86/boot/bzImage \ >> -p root=/dev/vda2 >> ... >> [ 1.772791] mousedev: PS/2 mouse device common for all mice >> [ 1.774603] cpuidle: using governor ladder >> [ 1.775490] cpuidle: using governor menu >> [ 1.776865] input: AT Raw Set 2 keyboard as >> /devices/platform/i8042/serio0/input/input0 >> [ 1.778609] TCP cubic registered >> [ 1.779456] Installing 9P2000 support >> [ 1.782390] Registering the dns_resolver key type >> [ 1.794323] registered taskstats version 1 >> >> ...and here the boot just stops, guest apparently waits for something > > Can you please share your kernel .config with me and I'll take a look > at it. We now have a "make kvmconfig" makefile target for enabling all > the necessary config options for guest kernels. I don't think any of > us developers are using SUSE so it can surely be a KVM tool bug as > well. Attached. Jan
On 11/06/2011 06:35 PM, Pekka Enberg wrote: > >> The difference here is that although I feel Alex's script is a > >> pointless project, I'm in no way opposed to merging it in the tree if > >> people use it and it solves their problem. Some people seem to be > >> violently opposed to merging the KVM tool and I'm having a difficult > >> time understanding why that is. > > > > One of the reasons is that if it is merged, anyone with a #include > > <linux/foo.h> will line up for the next merge window, wanting in. The > > other is that anything in the Linux source tree might gain an unfair > > advantage over out-of-tree projects (at least that's how I read Jan's > > comment). > > Well, having gone through the process of getting something included so > far, I'm not at all worried that there's going to be a huge queue of > "#include <linux/foo.h>" projects if we get in... > > What kind of unfair advantage are you referring to? I've specifically > said that the only way for KVM tool to become a reference > implementation would be that the KVM maintainers take the tool through > their tree. As that's not going to happen, I don't see what the > problem would be. I'm not personally worried about it either (though in fact a *minimal* reference implementation might not be a bad idea). There's the risk of getting informed in-depth press reviews ("Linux KVM Takes A Step Back From Running Windows Guests"), or of unfairly drawing developers away from competing projects.
On 11/06/2011 10:50 AM, Avi Kivity wrote: > On 11/06/2011 06:35 PM, Pekka Enberg wrote: >>>> The difference here is that although I feel Alex's script is a >>>> pointless project, I'm in no way opposed to merging it in the tree if >>>> people use it and it solves their problem. Some people seem to be >>>> violently opposed to merging the KVM tool and I'm having a difficult >>>> time understanding why that is. >>> >>> One of the reasons is that if it is merged, anyone with a #include >>> <linux/foo.h> will line up for the next merge window, wanting in. The >>> other is that anything in the Linux source tree might gain an unfair >>> advantage over out-of-tree projects (at least that's how I read Jan's >>> comment). >> >> Well, having gone through the process of getting something included so >> far, I'm not at all worried that there's going to be a huge queue of >> "#include<linux/foo.h>" projects if we get in... >> >> What kind of unfair advantage are you referring to? I've specifically >> said that the only way for KVM tool to become a reference >> implementation would be that the KVM maintainers take the tool through >> their tree. As that's not going to happen, I don't see what the >> problem would be. > > I'm not personally worried about it either (though in fact a *minimal* > reference implementation might not be a bad idea). There's the risk of > getting informed in-depth press reviews ("Linux KVM Takes A Step Back > From Running Windows Guests"), or of unfairly drawing developers away > from competing projects. I don't think that's really a concern. Competition is a good thing. QEMU is a large code base that a lot of people rely upon. It's hard to take big risks in a project like QEMU because the consequences are too high. OTOH, a project like KVM tool can take a lot of risks. They've attempted a very different command line syntax and they've put a lot of work into making virtio-9p a main part of the interface. 
If it turns out that these things end up working out well for them, then it becomes something we can copy in QEMU. If not, then we didn't go through the train wreck of totally changing CLI syntax only to find it was the wrong syntax. I'm quite happy with KVM tool and hope they continue working on it. My only real wish is that they wouldn't copy QEMU so much and would try bolder things that are fundamentally different from QEMU. Regards, Anthony Liguori >
On 06.11.2011, at 05:11, Pekka Enberg wrote: > On Sun, Nov 6, 2011 at 2:43 PM, Avi Kivity <avi@redhat.com> wrote: >> Alex's script, though, is just a few dozen lines. kvm-tool is a 20K >> patch - in fact 2X as large as kvm when it was first merged. And its >> main feature seems to be that "it is not qemu". > > I think I've mentioned many times that I find the QEMU source terribly > difficult to read and hack on. So if you mean "not qemu" from that > point of view, sure, I think it's a very important point. The command > line interface is also "not qemu" for a very good reason too. That's a matter of taste. In fact, I like the QEMU source code for most parts and there was a whole talk around it on LinuxCon where people agreed that it was really easy to hack on to prototype new hardware: https://events.linuxfoundation.org/events/linuxcon-europe/waskiewicz As for all matters concerning taste, I don't think we would ever get to a common ground here :). Alex
On 11/06/2011 07:06 AM, Pekka Enberg wrote:
> On Sun, Nov 6, 2011 at 2:43 PM, Avi Kivity <avi@redhat.com> wrote:
>> You say that kvm-tool's scope is broader than Alex's script, therefore
>> the latter is pointless.
>
> I'm saying that Alex's script is pointless because it's not attempting
> to fix the real issues. For example, we're trying to make it as easy
> as possible to set up a guest and to be able to access guest data
> from the host. Alex's script is essentially just a simplified QEMU
> "front end" for kernel developers.
>
> That's why I feel it's a pointless thing to do.
>
> On Sun, Nov 6, 2011 at 2:43 PM, Avi Kivity <avi@redhat.com> wrote:
>> You accept that qemu's scope is broader than kvm-tool (and is a
>> superset). That is why many people think kvm-tool is pointless.
>
> Sure. I think it's mostly people that are interested in non-Linux
> virtualization that think the KVM tool is a pointless project.
> However, some people (including myself) think the KVM tool is a more
> usable and hackable tool than QEMU for Linux virtualization.

There are literally dozens of mini operating systems that exist for
exactly the same reason that you describe above. They are smaller and
easier to hack on than something like Linux.

Regards,

Anthony Liguori

> The difference here is that although I feel Alex's script is a
> pointless project, I'm in no way opposed to merging it in the tree if
> people use it and it solves their problem. Some people seem to be
> violently opposed to merging the KVM tool and I'm having a difficult
> time understanding why that is.
>
> Pekka
On Sun, 6 Nov 2011, Jan Kiszka wrote:
>> Can you please share your kernel .config with me and I'll take a look
>> at it. We now have a "make kvmconfig" makefile target for enabling all
>> the necessary config options for guest kernels. I don't think any of
>> us developers are using SUSE so it can surely be a KVM tool bug as
>> well.
>
> Attached.

It hung here as well. I ran

  make kvmconfig

on your .config and it works. It's basically these two:

@@ -1478,7 +1478,7 @@
 CONFIG_NETPOLL=y
 # CONFIG_NETPOLL_TRAP is not set
 CONFIG_NET_POLL_CONTROLLER=y
-CONFIG_VIRTIO_NET=m
+CONFIG_VIRTIO_NET=y
 # CONFIG_VMXNET3 is not set
 # CONFIG_ISDN is not set
 # CONFIG_PHONE is not set
@@ -1690,7 +1690,7 @@
 # CONFIG_SERIAL_PCH_UART is not set
 # CONFIG_SERIAL_XILINX_PS_UART is not set
 CONFIG_HVC_DRIVER=y
-CONFIG_VIRTIO_CONSOLE=m
+CONFIG_VIRTIO_CONSOLE=y
 CONFIG_IPMI_HANDLER=m
 # CONFIG_IPMI_PANIC_EVENT is not set
 CONFIG_IPMI_DEVICE_INTERFACE=m

Pekka
On 06.11.2011, at 05:06, Pekka Enberg wrote:

> On Sun, Nov 6, 2011 at 2:43 PM, Avi Kivity <avi@redhat.com> wrote:
>> You say that kvm-tool's scope is broader than Alex's script, therefore
>> the latter is pointless.
>
> I'm saying that Alex's script is pointless because it's not attempting
> to fix the real issues. For example, we're trying to make it as easy
> as possible to set up a guest and to be able to access guest data
> from the host. Alex's script is essentially just a simplified QEMU
> "front end" for kernel developers.
>
> That's why I feel it's a pointless thing to do.

It's a script tailored to what Linus told me he wanted to see. I merely
wanted to prove the point that what he wanted can be achieved without
thousands and thousands of lines of code by reusing what is already
there. IMHO less code is usually a good thing.

In fact, why don't you just provide a script in tools/testing/ that
fetches KVM Tool from a git tree somewhere else and compiles it? It
could easily live outside the kernel tree - you can even grab our
awesome "fetch all Linux headers" script from QEMU so you can keep in
sync with KVM header files.

At that point, both front ends would live in separate trees, could
evolve however they like, and everyone's happy, because KVM Tool would
still be easy to use for people who want it by executing said shell
script.

> On Sun, Nov 6, 2011 at 2:43 PM, Avi Kivity <avi@redhat.com> wrote:
>> You accept that qemu's scope is broader than kvm-tool (and is a
>> superset). That is why many people think kvm-tool is pointless.
>
> Sure. I think it's mostly people that are interested in non-Linux
> virtualization that think the KVM tool is a pointless project.
> However, some people (including myself) think the KVM tool is a more
> usable and hackable tool than QEMU for Linux virtualization.

Sure. That's taste. If I think that tcsh is a better shell than bash,
do I pull it into the kernel tree just so "it lies there"? It
definitely uses kernel interfaces too, so I can make up just as many
reasons as you to pull it in.

> The difference here is that although I feel Alex's script is a
> pointless project, I'm in no way opposed to merging it in the tree if
> people use it and it solves their problem. Some people seem to be
> violently opposed to merging the KVM tool and I'm having a difficult
> time understanding why that is.

It's a matter of size and scope. Write a shell script that clones,
builds and executes KVM Tool and throw it in testing/tools/ and I'll
happily ack it!

Alex
On 2011-11-06 18:11, Pekka Enberg wrote:
> On Sun, 6 Nov 2011, Jan Kiszka wrote:
>>> Can you please share your kernel .config with me and I'll take a look
>>> at it. We now have a "make kvmconfig" makefile target for enabling all
>>> the necessary config options for guest kernels. I don't think any of
>>> us developers are using SUSE so it can surely be a KVM tool bug as
>>> well.
>>
>> Attached.
>
> It hung here as well. I ran
>
>   make kvmconfig
>
> on your .config and it works. It's basically these two:
>
> @@ -1478,7 +1478,7 @@
>  CONFIG_NETPOLL=y
>  # CONFIG_NETPOLL_TRAP is not set
>  CONFIG_NET_POLL_CONTROLLER=y
> -CONFIG_VIRTIO_NET=m
> +CONFIG_VIRTIO_NET=y
>  # CONFIG_VMXNET3 is not set
>  # CONFIG_ISDN is not set
>  # CONFIG_PHONE is not set
> @@ -1690,7 +1690,7 @@
>  # CONFIG_SERIAL_PCH_UART is not set
>  # CONFIG_SERIAL_XILINX_PS_UART is not set
>  CONFIG_HVC_DRIVER=y
> -CONFIG_VIRTIO_CONSOLE=m
> +CONFIG_VIRTIO_CONSOLE=y
>  CONFIG_IPMI_HANDLER=m
>  # CONFIG_IPMI_PANIC_EVENT is not set
>  CONFIG_IPMI_DEVICE_INTERFACE=m
>
> Pekka

Doesn't help here (with a disk image).

Also, both dependencies make no sense to me as we boot from disk, not
from net, and the console is on ttyS0.

Jan
On Sun, Nov 6, 2011 at 7:15 PM, Alexander Graf <agraf@suse.de> wrote:
>> The difference here is that although I feel Alex's script is a
>> pointless project, I'm in no way opposed to merging it in the tree if
>> people use it and it solves their problem. Some people seem to be
>> violently opposed to merging the KVM tool and I'm having a difficult
>> time understanding why that is.
>
> It's a matter of size and scope. Write a shell script that clones,
> builds and executes KVM Tool and throw it in testing/tools/ and I'll
> happily ack it!

That's pretty much what git submodule would do, isn't it?

I really don't see the point in doing that. We want to be part of the
regular kernel history and release cycle. We want people to be able to
see what's going on in our tree to keep us honest, and we want to make
the barrier of entry as low as possible.

It's not just about code; it's as much about culture and development
process.

Pekka
On 06.11.2011, at 09:28, Pekka Enberg wrote:

> On Sun, Nov 6, 2011 at 7:15 PM, Alexander Graf <agraf@suse.de> wrote:
>>> The difference here is that although I feel Alex's script is a
>>> pointless project, I'm in no way opposed to merging it in the tree if
>>> people use it and it solves their problem. Some people seem to be
>>> violently opposed to merging the KVM tool and I'm having a difficult
>>> time understanding why that is.
>>
>> It's a matter of size and scope. Write a shell script that clones,
>> builds and executes KVM Tool and throw it in testing/tools/ and I'll
>> happily ack it!
>
> That's pretty much what git submodule would do, isn't it?
>
> I really don't see the point in doing that. We want to be part of
> regular kernel history and release cycle. We want people to be able to
> see what's going on in our tree to keep us honest and we want to make
> the barrier of entry as low as possible.
>
> It's not just about code, it's as much about culture and development
> process.

So you're saying that projects that are not living in the kernel tree
aren't worthwhile? Or are you only trying to bump your Ohloh stats?

I mean, seriously, git makes it so easy to have a separate tree that it
almost doesn't make sense not to have one. You're constantly working in
separate trees yourself, because every one of your branches is
separate. Keeping in sync with the kernel release cycles (which I don't
think makes any sense for you) should be easy enough too by merely
releasing in sync with the kernel tree...

Alex
On Sun, 6 Nov 2011, Jan Kiszka wrote:
> Doesn't help here (with a disk image).
>
> Also, both dependencies make no sense to me as we boot from disk, not
> from net, and the console is on ttyS0.

It's only VIRTIO_NET and the guest is not actually stuck, it just takes
a while to boot:

[    1.866614] Installing 9P2000 support
[    1.868991] Registering the dns_resolver key type
[    1.878084] registered taskstats version 1
[   13.927367] Root-NFS: no NFS server address
[   13.929500] VFS: Unable to mount root fs via NFS, trying floppy.
[   13.939177] VFS: Mounted root (9p filesystem) on device 0:12.
[   13.941522] devtmpfs: mounted
[   13.943317] Freeing unused kernel memory: 684k freed
Mounting...
Starting '/bin/sh'...
sh-4.2#

I'm CC'ing Sasha and Asias.
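For reference, a 9p root boot like the one in the log corresponds to a
kernel command line along these lines; the exact flags (transport and
protocol version) are an assumption here, not something quoted from the
thread:

```
root=/dev/root rw rootfstype=9p rootflags=trans=virtio,version=9p2000.L init=/bin/sh
```

The ~12-second pause before "Root-NFS: no NFS server address" is the
NFS-root probe timing out, which is why building VIRTIO_NET in makes
the hang disappear.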
On Sun, Nov 6, 2011 at 7:30 PM, Alexander Graf <agraf@suse.de> wrote: >> That's pretty much what git submodule would do, isn't it? >> >> I really don't see the point in doing that. We want to be part of >> regular kernel history and release cycle. We want people to be able to >> see what's going on in our tree to keep us honest and we want to make >> the barrier of entry as low as possible. >> >> It's not just about code, it's as much about culture and development process. > > So you're saying that projects that are not living in the kernel tree aren't worthwhile? Yeah, that's exactly what I'm saying... > Or are you only trying to bump your oloh stats? That too! On Sun, Nov 6, 2011 at 7:30 PM, Alexander Graf <agraf@suse.de> wrote: > I mean, seriously, git makes it so easy to have a separate tree that > it almost doesn't make sense not to have one. You're constantly > working in separate trees yourself because every one of your > branches is separate. Keeping in sync with the kernel release cycles > (which I don't think makes any sense for you) should be easy enough > too by merely releasing in sync with the kernel tree... We'd be the only subsystem doing that! Why on earth do you think we want to be the first ones to do that? We don't want to be different, we want to make the barrier of entry low. Pekka
On Sun, Nov 6, 2011 at 7:08 PM, Anthony Liguori <anthony@codemonkey.ws> wrote: > I'm quite happy with KVM tool and hope they continue working on it. My only > real wish is that they wouldn't copy QEMU so much and would try bolder > things that are fundamentally different from QEMU. Hey, right now our only source of crazy ideas is Ingo and I think he's actually a pretty conservative guy when it comes to technology. Avi has expressed some crazy ideas in the past but they require switching away from C and that's not something we're interested in doing. ;-) Pekka
On Sun, Nov 06, 2011 at 11:08:10AM -0600, Anthony Liguori wrote: > I'm quite happy with KVM tool and hope they continue working on it. > My only real wish is that they wouldn't copy QEMU so much and would > try bolder things that are fundamentally different from QEMU. My big wish is that they don't try to merge the KVM tool into the kernel code. It's a separate userspace project, and there's no reason for it to be bundled with kernel code. It just makes the kernel sources larger. The mere fact that qemu-kvm exists means that the KVM interface has to remain backward compatible; it *is* an ABI. So integrating kvm-tool into the kernel isn't going to work as a free pass to make non-backwards compatible changes to the KVM user/kernel interface. Given that, why bloat the kernel source tree size? Please, keep the kvm-tool sources as a separate git tree. - Ted
On Sun, Nov 06, 2011 at 11:08:10AM -0600, Anthony Liguori wrote: >> I'm quite happy with KVM tool and hope they continue working on it. >> My only real wish is that they wouldn't copy QEMU so much and would >> try bolder things that are fundamentally different from QEMU. On Sun, Nov 6, 2011 at 8:31 PM, Ted Ts'o <tytso@mit.edu> wrote: > My big wish is that they don't try to merge the KVM tool into the > kernel code. It's a separate userspace project, and there's no reason > for it to be bundled with kernel code. It just makes the kernel > sources larger. The mere fact that qemu-kvm exists means that the KVM > interface has to remain backward compatible; it *is* an ABI. > > So integrating kvm-tool into the kernel isn't going to work as a free > pass to make non-backwards compatible changes to the KVM user/kernel > interface. Given that, why bloat the kernel source tree size? Ted, I'm confused. Making backwards incompatible ABI changes has never been on the table. Why are you bringing it up? Pekka
On Sun, Nov 6, 2011 at 8:54 PM, Pekka Enberg <penberg@kernel.org> wrote: >> So integrating kvm-tool into the kernel isn't going to work as a free >> pass to make non-backwards compatible changes to the KVM user/kernel >> interface. Given that, why bloat the kernel source tree size? > > Ted, I'm confused. Making backwards incompatible ABI changes has never > been on the table. Why are you bringing it up? And btw, KVM tool is not a random userspace project - it was designed to live in tools/kvm from the beginning. I've explained the technical rationale for sharing kernel code here: https://lkml.org/lkml/2011/11/4/150 Please also see Ingo's original rant that started the project: http://thread.gmane.org/gmane.linux.kernel/962051/focus=962620 Pekka
On 11/06/2011 06:28 PM, Pekka Enberg wrote:
> On Sun, Nov 6, 2011 at 7:15 PM, Alexander Graf <agraf@suse.de> wrote:
>>> The difference here is that although I feel Alex's script is a
>>> pointless project, I'm in no way opposed to merging it in the tree if
>>> people use it and it solves their problem. Some people seem to be
>>> violently opposed to merging the KVM tool and I'm having difficult
>>> time understanding why that is.
>>
>> It's a matter of size and scope. Write a shell script that clones,
>> builds and executes KVM Tool and throw it in testing/tools/ and I'll
>> happily ack it!
>
> That's pretty much what git submodule would do, isn't it?

Absolutely not. It would always fetch HEAD from the KVM tool repo. A
submodule ties each supermodule commit to a particular submodule
commit.

> I really don't see the point in doing that. We want to be part of
> regular kernel history and release cycle.

But I'm pretty certain that, when testing 3.2 with KVM tool in a couple
of years, I want all the shining new features you added in this time; I
don't want the old end-2011 code. Same if I'm bisecting kernels, I
don't want to build KVM tool once per bisection cycle, do I?

Paolo
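Paolo's distinction can be seen directly in how git records a
submodule: the superproject stores one exact commit id (a "gitlink"
tree entry of mode 160000), whereas a fetch script tracks whatever the
remote's HEAD happens to be. A self-contained sketch with throwaway
local repos (all paths are illustrative):

```shell
# Show that a submodule pins an exact commit of the subrepository.
# Uses throwaway local repos in /tmp, so no network access is needed.
set -e
rm -rf /tmp/kvm-tool.git /tmp/kernel
git init -q /tmp/kvm-tool.git
(cd /tmp/kvm-tool.git && echo all: > Makefile && git add Makefile && \
    git -c user.name=t -c user.email=t@t commit -qm "kvm tool v1")

git init -q /tmp/kernel
cd /tmp/kernel
# protocol.file.allow is needed for local-path submodules on newer git
git -c protocol.file.allow=always submodule --quiet add /tmp/kvm-tool.git tools/kvm
git -c user.name=t -c user.email=t@t commit -qm "pin kvm tool"

# The superproject records a specific commit id, not a branch name:
git ls-tree HEAD tools/kvm    # prints "160000 commit <sha>  tools/kvm"
```

Checking out an old superproject commit and running `git submodule
update` restores exactly the submodule commit recorded back then, which
is the behavior a clone-HEAD script cannot give you.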
On 11/06/2011 07:05 PM, Pekka Enberg wrote:
>> I mean, seriously, git makes it so easy to have a separate tree that
>> it almost doesn't make sense not to have one. You're constantly
>> working in separate trees yourself because every one of your
>> branches is separate. Keeping in sync with the kernel release cycles
>> (which I don't think makes any sense for you) should be easy enough
>> too by merely releasing in sync with the kernel tree...
>
> We'd be the only subsystem doing that!

GStreamer (V4L), RTSAdmin (LIO target), sg3_utils, and trousers are all
out of tree, and none of their authors are even thinking of doing all
this brouhaha to get merged into Linus's tree.

Paolo
On Sun, Nov 6, 2011 at 9:11 PM, Paolo Bonzini <pbonzini@redhat.com> wrote: >> I really don't see the point in doing that. We want to be part of >> regular kernel history and release cycle. > > But I'm pretty certain that, when testing 3.2 with KVM tool in a couple of > years, I want all the shining new features you added in this time; I don't > want the old end-2011 code. Same if I'm bisecting kernels, I don't want to > build KVM tool once per bisection cycle, do I? If you're bisecting breakage that can be in the guest kernel or the KVM tool, you'd want to build both. What would prevent you from using a newer KVM tool with an older kernel?
On Sun, Nov 6, 2011 at 9:14 PM, Paolo Bonzini <pbonzini@redhat.com> wrote: > GStreamer (V4L), RTSAdmin (LIO target), sg3_utils, trousers all are out of > tree, and nobody of their authors is even thinking of doing all this > brouhaha to get merged into Linus's tree. We'd be the first subsystem to use the download script thing Alex suggested.
On 11/06/2011 08:17 PM, Pekka Enberg wrote:
>> But I'm pretty certain that, when testing 3.2 with KVM tool in a couple of
>> years, I want all the shining new features you added in this time; I don't
>> want the old end-2011 code. Same if I'm bisecting kernels, I don't want to
>> build KVM tool once per bisection cycle, do I?
>
> If you're bisecting breakage that can be in the guest kernel or the
> KVM tool, you'd want to build both.

No. I want to try new tool/old kernel and old tool/new kernel (the
kernel can be either guest or host, depending on the nature of the
bug), and then bisect just one. (*) And that's the exceptional case,
and only KVM tool developers really should have the need to do that.

(*) Not coincidentally, that's what git bisect does when HEAD is a
merge of two unrelated histories.

> What would prevent you from using a newer KVM tool with an older kernel?

Nothing, but I'm just giving you *strong* hints that a submodule or a
merged tool is the wrong solution, and the histories of kernel and tool
should be kept separate.

More clearly: for its supposedly intended usage, namely testing
development kernels in a *guest*, KVM tool will generally not run on
the exact *host* kernel that is in the tree it lives with. Almost
never, in fact. Unlike perf, if you want to test multiple guest kernels
you should never need to rebuild KVM tool!

This is the main argument as to whether or not to merge the tool. Would
the integration of the *build* make sense or not? Assume you adapt the
ktest script to make both the KVM tool and the kernel, and test the
latter using the former. Your host kernel never changes, and yet you
introduce a new variable in your testing. That complicates things, it
doesn't simplify them.

Paolo
On Sun, Nov 6, 2011 at 10:01 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> If you're bisecting breakage that can be in the guest kernel or the
>> KVM tool, you'd want to build both.
>
> No. I want to try new tool/old kernel and old tool/new kernel (kernel can
> be either guest or host, depending on the nature of the bug), and then
> bisect just one. (*) And that's the exceptional case, and only KVM tool
> developers really should have the need to do that.

Exactly - having the source code in the Linux kernel tree covers the
"exceptional case" where we're unsure which part of the equation broke
things (which are btw the nastiest issues we've had so far). I have no
idea why you're trying to convince me that it doesn't matter. You can
bisect only one of the components in isolation just fine.

On Sun, Nov 6, 2011 at 10:01 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> What would prevent you from using a newer KVM tool with an older kernel?
>
> Nothing, but I'm just giving you *strong* hints that a submodule or a merged
> tool is the wrong solution, and the histories of kernel and tool should be
> kept separate.
>
> More clearly: for its supposedly intended usage, namely testing development
> kernels in a *guest*, KVM tool will generally not run on the exact *host*
> kernel that is in the tree it lives with. Almost never, in fact. Unlike
> perf, if you want to test multiple guest kernels you should never need to
> rebuild KVM tool!
>
> This is the main argument as to whether or not to merge the tool. Would the
> integration of the *build* make sense or not? Assume you adapt the ktest
> script to make both the KVM tool and the kernel, and test the latter using
> the former. Your host kernel never changes, and yet you introduce a new
> variable in your testing. That complicates things, it doesn't simplify
> them.

I don't understand what you're trying to say. There's no requirement to
build the KVM tool if you're bisecting a guest kernel.

Pekka
On Sun, Nov 6, 2011 at 10:01 PM, Paolo Bonzini <pbonzini@redhat.com> wrote: > Nothing, but I'm just giving you *strong* hints that a submodule or a merged > tool is the wrong solution, and the histories of kernel and tool should be > kept separate. And btw, I don't really understand what you're trying to accomplish with this line of reasoning. We've tried both separate and shared repository and the latter is much better from development point of view. This is not some random userspace project that uses the kernel system calls. It's a hypervisor that implements virtio drivers, serial emulation, and mini-BIOS. It's very close to the kernel which is why it's such a good fit with the kernel tree. I'd actually be willing to argue that from purely technical point of view, KVM tool makes much more sense to have in the kernel tree than perf does. Pekka
From: Frank Ch. Eigler <fche@redhat.com>
Date: Sun, 06 Nov 2011 17:08:48 -0500

Pekka Enberg <penberg@kernel.org> writes:

> [...] We don't want to be different, we want to make the barrier of
> entry low.

When has the barrier of entry into the kernel ever been "low" for
anyone not already working in the kernel?

- FChE
On Sun, Nov 06, 2011 at 08:58:20PM +0200, Pekka Enberg wrote:
>> Ted, I'm confused. Making backwards incompatible ABI changes has never
>> been on the table. Why are you bringing it up?
>
> And btw, KVM tool is not a random userspace project - it was designed
> to live in tools/kvm from the beginning. I've explained the technical
> rationale for sharing kernel code here:
>
>   https://lkml.org/lkml/2011/11/4/150
>
> Please also see Ingo's original rant that started the project:
>
>   http://thread.gmane.org/gmane.linux.kernel/962051/focus=962620

Because I don't buy any of these arguments. We have the same kernel
developers working on xfs and xfsprogs, ext4 and e2fsprogs, btrfs and
btrfsprogs, and we don't have those userspace projects in the kernel
source tree.

The only excuse I can see is a hope to make random changes to the
kernel and userspace tools without having to worry about compatibility
problems, which is an argument I've seen with perf (that you have to
use the same version of perf as the kernel version, which to me is bad
software engineering). And that's why I pointed out that you can't do
that with KVM, since we have out-of-tree userspace users, namely
qemu-kvm.

The rest of the arguments are arguments for a new effort, which is fine
--- but not an excuse for putting it in the kernel source tree.

- Ted
On 11/06/2011 12:09 PM, Pekka Enberg wrote:
> On Sun, Nov 6, 2011 at 7:08 PM, Anthony Liguori <anthony@codemonkey.ws> wrote:
>> I'm quite happy with KVM tool and hope they continue working on it. My only
>> real wish is that they wouldn't copy QEMU so much and would try bolder
>> things that are fundamentally different from QEMU.
>
> Hey, right now our only source of crazy ideas is Ingo and I think he's
> actually a pretty conservative guy when it comes to technology. Avi
> has expressed some crazy ideas in the past but they require switching
> away from C and that's not something we're interested in doing. ;-)

Just a couple of random suggestions:

- Drop SDL/VNC. Make a proper Cairo GUI with a full blown GTK
  interface. Don't rely on virt-manager for this. Not that I have
  anything against virt-manager, but there are many layers between you
  and the end GUI if you go that route.

- Sandbox the device model from day #1. The size of the Linux kernel
  interface is pretty huge, and as a hypervisor, it's the biggest place
  for improvement from a security perspective. We're going to do
  sandboxing in QEMU, but it's going to be difficult. It would be much
  easier for you given where you're at.

Regards,

Anthony Liguori

> Pekka
On Sun, 6 Nov 2011, Ted Ts'o wrote: > The only excuse I can see is a hope to make random changes to the > kernel and userspace tools without having to worry about compatibility > problems, which is an argument I've seen with perf (that you have to > use the same version of perf as the kernel version, which to me is bad > software engineering). And that's why I pointed out that you can't do > that with KVM, since we have out-of-tree userspace users, namely > qemu-kvm. I've never heard ABI incompatibility used as an argument for perf. Ingo? As for the KVM tool, merging has never been about being able to do ABI incompatible changes and never will be. I'm still surprised you even brought this up because I've always been one to _complain_ about people breaking the ABI - not actually breaking it (at least on purpose). Pekka
Hi Anthony, On Sun, 6 Nov 2011, Anthony Liguori wrote: > - Drop SDL/VNC. Make a proper Cairo GUI with a full blown GTK interface. > Don't rely on virt-manager for this. Not that I have anything against > virt-manager but there are many layers between you and the end GUI if you go > that route. Funny that you should mention this. It was actually what I started out with. I went for SDL because it was a low-hanging fruit after the VNC patches which I didn't do myself. However, it was never figured out if there was going to be a virtio transport for GPU commands: http://lwn.net/Articles/408831/ On Sun, 6 Nov 2011, Anthony Liguori wrote: > - Sandbox the device model from day #1. The size of the Linux kernel > interface is pretty huge and as a hypervisor, it's the biggest place for > improvement from a security perspective. We're going to do sandboxing in > QEMU, but it's going to be difficult. It would be much easier for you given > where you're at. Completely agreed. I think Sasha is actually starting to work on this. See the "Secure KVM" thread on kvm@. Pekka
On Mon, Nov 7, 2011 at 12:08 AM, Frank Ch. Eigler <fche@redhat.com> wrote: >> [...] We don't want to be different, we want to make the barrier of >> entry low. > > When has the barrier of entry into the kernel ever been "low" > for anyone not already working in the kernel? What's your point? Working on the KVM tool requires knowledge of the Linux kernel. Pekka
On 11/06/2011 09:17 PM, Pekka Enberg wrote:
>> No. I want to try new tool/old kernel and old tool/new kernel (kernel can
>> be either guest or host, depending on the nature of the bug), and then
>> bisect just one. (*) And that's the exceptional case, and only KVM tool
>> developers really should have the need to do that.
>
> Exactly - having the source code in Linux kernel tree covers the
> "exceptional case" where we're unsure which part of the equation broke
> things (which are btw the nastiest issues we've had so far).

No, having the source code in the Linux kernel tree is perfectly
useless for the exceptional case, and it forces you to go through extra
hoops to build only one component. Small hoops, such as adding
"-- tools/kvm" to "git bisect start", perhaps, but still hoops that
aren't traded for a practical advantage. You keep saying "oh things
have been so much better" because "it's so close to the kernel" and "it
worked so great for perf", but you haven't brought any practical
example that we can stare at in admiration.

(BTW, I'm also convinced like Ted that not having a defined perf ABI
might have made sense in the beginning, but it has now devolved into
bad software engineering practice.)

> I have no idea why you're trying to convince me that it doesn't matter.

I'm not trying to convince you that it doesn't matter, I'm trying to
convince you that it doesn't *make sense*.

> It's a hypervisor that implements virtio drivers, serial
> emulation, and mini-BIOS.

... all of which have a spec against which you should be working. Save
perhaps for the mini-BIOS, if you develop against the kernel source
rather than the spec, you're doing it *wrong*. Very wrong. But you've
been told this many times already.

Paolo
On Mon, Nov 7, 2011 at 10:00 AM, Paolo Bonzini <pbonzini@redhat.com> wrote: > No, having the source code in Linux kernel tree is perfectly useless for the > exceptional case, and forces you to go through extra hoops to build only one > component. Small hoops such as adding "-- tools/kvm" to "git bisect start" > perhaps, but still hoops that aren't traded for a practical advantage. You > keep saying "oh things have been so much better" because "it's so close to > the kernel" and "it worked so great for perf", but you haven't brought any > practical example that we can stare at in admiration. The _practical example_ is the working software in tools/kvm! >> I have no idea why you're trying to convince me that it doesn't matter. > > I'm not trying to convince you that it doesn't matter, I'm trying to > convince you that it doesn't *make sense*. > >> It's a hypervisor that implements virtio drivers, serial >> emulation, and mini-BIOS. > > ... all of which have a spec against which you should be working. Save > perhaps for the mini-BIOS, if you develop against the kernel source rather > than the spec you're doing it *wrong*. Very wrong. But you've been told > this many times already. I have zero interest in arguing with you about something you have no practical experience on. I've tried both out-of-tree and in-tree development for the KVM tool and I can tell you the latter is much more productive environment. We are obviously also using specifications but as you damn well should know, specifications don't matter nearly as much as working code. That's why it's important to have easy access to both. Pekka
On Mon, Nov 7, 2011 at 10:00 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> (BTW, I'm also convinced like Ted that not having a defined perf ABI might
> have made sense in the beginning, but it has now devolved into bad software
> engineering practice).

I'm not a perf maintainer so I don't know what the situation wrt. ABI
breakage is. Your and Ted's comments don't match my assumptions or
experience, though.

Pekka
On 11/07/2011 09:09 AM, Pekka Enberg wrote: > We are obviously also using specifications but as you damn well should > know, specifications don't matter nearly as much as working code. Specifications matter much more than working code. Quirks are a fact of life but should always come second. To bring you an example from the kernel, there is a very boring list of "PCI quirks" and a lot of code for "PCI specs", not the other way round. Paolo
On 11/07/2011 09:09 AM, Pekka Enberg wrote:
>> We are obviously also using specifications but as you damn well should
>> know, specifications don't matter nearly as much as working code.

On Mon, 7 Nov 2011, Paolo Bonzini wrote:
> Specifications matter much more than working code. Quirks are a fact
> of life but should always come second.

To quote Linus:

  And I have seen _lots_ of total crap work that was based on specs. It's
  _the_ single worst way to write software, because it by definition means
  that the software was written to match theory, not reality.

  [ http://kerneltrap.org/node/5725 ]

So no, I don't agree with you at all.

Pekka
On 11/07/2011 09:45 AM, Pekka Enberg wrote: > >> Specifications matter much more than working code. Quirks are a fact >> of life but should always come second. > > To quote Linus: > > And I have seen _lots_ of total crap work that was based on specs. It's > _the_ single worst way to write software, because it by definition means > that the software was written to match theory, not reality. All generalizations are false. Paolo
On 11/07/2011 09:45 AM, Pekka Enberg wrote:
>>> Specifications matter much more than working code. Quirks are a fact
>>> of life but should always come second.
>>
>> To quote Linus:
>>
>>   And I have seen _lots_ of total crap work that was based on specs. It's
>>   _the_ single worst way to write software, because it by definition means
>>   that the software was written to match theory, not reality.

On Mon, Nov 7, 2011 at 10:52 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> All generalizations are false.

What is that supposed to mean? You claimed we're "doing it wrong" and I
explained to you why we are doing it the way we are. Really, the way we
do things in the KVM tool is not a bug, it's a feature.

Pekka
Hi,

> "Usable" - I've tried kvm-tool several times and still (today) fail to
> get a standard SUSE image (with a kernel I have to compile and provide
> separately...) up and running *). Likely a user mistake, but none that
> is very obvious. At least to me.

Same here.

No support for booting from CDROM.
No support for booting from Network.
Thus no way to install a new guest image.

Booting an existing qcow2 guest image failed, the guest started throwing
I/O errors. And even to try that I had to manually extract the kernel
and initrd images from the guest. Maybe you should check with the Xen
guys, they have a funky 'pygrub' which sort-of automates the
copy-kernel-from-guest-image process.

Booting the host kernel failed too. Standard distro kernel. The virtio
bits are modular, not statically compiled into the kernel. kvm tool
can't handle that.

You have to build your own kernel and make sure you flip the correct
config bits, then you can boot it to a shell prompt. Trying anything
else just doesn't work today ...

cheers,
  Gerd
On Mon, Nov 7, 2011 at 12:11 PM, Gerd Hoffmann <kraxel@redhat.com> wrote:
> No support for booting from CDROM.
> No support for booting from Network.
> Thus no way to install a new guest image.

Sure. It's a pain point which we need to fix.

> Booting an existing qcow2 guest image failed, the guest started throwing
> I/O errors. And even to try that I had to manually extract the kernel
> and initrd images from the guest. Maybe you should check with the Xen
> guys, they have a funky 'pygrub' which sort-of automates the
> copy-kernel-from-guest-image process.

QCOW2 support is experimental. The I/O errors are caused by forced
read-only mode.

> Booting the host kernel failed too. Standard distro kernel. The virtio
> bits are modular, not statically compiled into the kernel. kvm tool
> can't handle that.

I think we have some support for booting modular distro kernels too if
you tell the KVM tool where to find the initrd. It sucks out-of-the-box,
though, because nobody seems to be using it.

> You have to build your own kernel and make sure you flip the correct
> config bits, then you can boot it to a shell prompt. Trying anything
> else just doesn't work today ...

What can I say? Patches welcome? :-)

Pekka
Hi,
> It's not just about code, it's as much about culture and development process.
Indeed. The BSDs have both kernel and the base system in a single
repository. There are probably good reasons for (and against) it.
In Linux we don't have that culture. No tool (except perf) lives in the
kernel repo. I fail to see why kvm-tool is that much different from
udev, util-linux, iproute, filesystem tools, that it should be included.
cheers,
Gerd
On Mon, Nov 7, 2011 at 12:23 PM, Gerd Hoffmann <kraxel@redhat.com> wrote:
> Hi,
>
>> It's not just about code, it's as much about culture and development process.
>
> Indeed. The BSDs have both kernel and the base system in a single
> repository. There are probably good reasons for (and against) it.
>
> In Linux we don't have that culture. No tool (except perf) lives in the
> kernel repo. I fail to see why kvm-tool is that much different from
> udev, util-linux, iproute, filesystem tools, that it should be included.

tools/power was merged just two versions ago, do you think that merging
that was a mistake?
On 06.11.2011 19:31, Ted Ts'o wrote:
> On Sun, Nov 06, 2011 at 11:08:10AM -0600, Anthony Liguori wrote:
>> I'm quite happy with KVM tool and hope they continue working on it.
>> My only real wish is that they wouldn't copy QEMU so much and would
>> try bolder things that are fundamentally different from QEMU.
>
> My big wish is that they don't try to merge the KVM tool into the
> kernel code. It's a separate userspace project, and there's no reason
> for it to be bundled with kernel code. It just makes the kernel
> sources larger.

In fact, the reverse is true as well: it makes kvm-tool's sources
larger. Instead of just cloning a small repository I need to clone the
whole kernel repository, even though I'm not a kernel developer and
don't intend to touch anything but tools/kvm.

Not too bad for me, as I have a kernel repository lying around anyway
and can share most of the content, but there are people who don't.
Still, having an additional 1.2 GB repository just for the ~1 MB in
which I'm really interested doesn't make me too happy. And dealing with
a huge repository also means that even git becomes slower (which means I
had to turn off some functionality for my shell prompt in this repo, as
I didn't like waiting for much more than a second or two).

Makes it a lot less hackable for me unless you want to restrict the set
of potential developers to Linux kernel developers...

Kevin
On 11/07/2011 11:30 AM, Sasha Levin wrote:
>> In Linux we don't have that culture. No tool (except perf) lives in the
>> kernel repo. I fail to see why kvm-tool is that much different from
>> udev, util-linux, iproute, filesystem tools, that it should be included.
>
> tools/power was merged just two versions ago, do you think that
> merging that was a mistake?

Indeed I do not see any advantage, since all the interfaces they use are
stable anyway (sysfs, msr.ko).

If they had gone into x86info, for example, my distro (F16, not exactly
conservative) would have likely picked those tools up already, but it
didn't.

Paolo
On Mon, 7 Nov 2011, Gerd Hoffmann wrote:
>> It's not just about code, it's as much about culture and development process.
>
> Indeed. The BSDs have both kernel and the base system in a single
> repository. There are probably good reasons for (and against) it.
>
> In Linux we don't have that culture. No tool (except perf) lives in the
> kernel repo. I fail to see why kvm-tool is that much different from
> udev, util-linux, iproute, filesystem tools, that it should be included.

You seem to think perf is an exception - I think it's going to be the
future norm for userspace components that are very close to the kernel.
That's in fact what Ingo was arguing for when he suggested QEMU be
merged into the kernel tree.

Pekka
On Mon, 7 Nov 2011, Kevin Wolf wrote:
> Makes it a lot less hackable for me unless you want to restrict the set
> of potential developers to Linux kernel developers...

We're not restricting potential developers to Linux kernel folks. We're
making it easy for them because we believe that the KVM tool is a
userspace component that requires the kind of low-level knowledge Linux
kernel developers have.

I think you're looking at the KVM tool with your QEMU glasses on without
realizing that there's no point in comparing the two: we only support
Linux on Linux and we avoid hardware emulation as much as possible. So
what makes sense for QEMU doesn't necessarily translate to the KVM tool
project.

Pekka
On Mon, Nov 7, 2011 at 1:02 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Indeed I do not see any advantage, since all the interfaces they use are
> stable anyway (sysfs, msr.ko).
>
> If they had gone into x86info, for example, my distro (F16, not exactly
> conservative) would have likely picked those tools up already, but it
> didn't.

Distributing userspace tools in the kernel tree is a relatively new
concept, so it's not at all surprising distributions don't pick them up
as quickly. That doesn't mean it's a fundamentally flawed approach,
though.

Also, I'm mostly interested in defending the KVM tool, so I'd prefer not
to argue about whether carrying userspace code in the kernel tree makes
sense in general. The fact is that Linux is already doing it and I think
the only relevant question is whether or not the KVM tool qualifies. I
obviously think the answer is yes.

Pekka
* Pekka Enberg <penberg@cs.helsinki.fi> wrote:

> On Mon, 7 Nov 2011, Gerd Hoffmann wrote:
>
>>> It's not just about code, it's as much about culture and development
>>> process.
>>
>> Indeed. The BSDs have both kernel and the base system in a single
>> repository. There are probably good reasons for (and against) it.
>>
>> In Linux we don't have that culture. No tool (except perf) lives
>> in the kernel repo. I fail to see why kvm-tool is that much
>> different from udev, util-linux, iproute, filesystem tools, that
>> it should be included.
>
> You seem to think perf is an exception - I think it's going to be
> the future norm for userspace components that are very close to the
> kernel. That's in fact what Ingo was arguing for when he suggested
> QEMU to be merged to the kernel tree.

Yep, and the answer i got from the Qemu folks when i suggested that
merge was a polite "buzz off", along the lines of: "We don't want to
do that, but feel free to write your own tool, leave Qemu alone."

Now that people have done exactly that, some Qemu folks not only have
changed their objection from "write your own tool" to "erm, write
your own tool but do it the way *we* prefer you to do it" - they also
started contributing *against* the KVM tool with predictable,
once-every-3-months objections against its upstream merge...

That's not very nice and not very constructive.

The only valid technical objection against tools/kvm/ that i can see
would be that it's not useful enough yet for the upstream kernel versus
other tools such as Qemu. In all fairness i think we might still be at
that early stage of the project, but it's clearly progressing very
rapidly and i'm already using it on a daily basis for my own kernel
testing purposes. During the Kernel Summit that's how i tested
contemporary kernels on contemporary user-space remotely, without having
to risk a physical reboot.

Thanks,

	Ingo
On 07.11.2011 12:38, Pekka Enberg wrote:
> On Mon, 7 Nov 2011, Kevin Wolf wrote:
>> Makes it a lot less hackable for me unless you want to restrict the set
>> of potential developers to Linux kernel developers...
>
> We're not restricting potential developers to Linux kernel folks. We're
> making it easy for them because we believe that the KVM tool is a
> userspace component that requires the kind of low-level knowledge Linux
> kernel developers have.
>
> I think you're looking at the KVM tool with your QEMU glasses on without
> realizing that there's no point in comparing the two: we only support
> Linux on Linux and we avoid hardware emulation as much as possible. So
> what makes sense for QEMU, doesn't necessarily translate to the KVM tool
> project.

I'm not comparing anything. I'm not even referring to the virtualization
functionality of it. It could be doing anything else and it wouldn't
make a difference.

For the KVM tool I am not much more than a mere user. Trying it out was
tedious for me, as it is for anyone else who isn't a kernel developer.
That's all I'm saying. Making things easier for some kernel developers
while ignoring that it makes things harder for users at the same time is
something I consider a not-so-clever move.

Just wanted to point that out; feel free to ignore it, your priorities
are probably different.

Kevin
On 11/07/11 12:34, Pekka Enberg wrote:
> On Mon, 7 Nov 2011, Gerd Hoffmann wrote:
>>> It's not just about code, it's as much about culture and development
>>> process.
>>
>> Indeed. The BSDs have both kernel and the base system in a single
>> repository. There are probably good reasons for (and against) it.
>>
>> In Linux we don't have that culture. No tool (except perf) lives in the
>> kernel repo. I fail to see why kvm-tool is that much different from
>> udev, util-linux, iproute, filesystem tools, that it should be included.
>
> You seem to think perf is an exception - I think it's going to be the
> future norm for userspace components that are very close to the kernel.

perf *is* an exception today.

It might make sense to change that. But IMHO it only makes sense if
there is really broad agreement on it and other core stuff moves into
the kernel too. Then you'll be able to get advantages out of it. For
example standardizing the process to create an initramfs (using the
userspace tools shipped with the kernel) instead of having each distro
create its own way.

I somehow doubt we'll see such broad agreement though. Most people seem
to be happy with the current model. There is a reason why the klibc +
early-userspace-in-kernel-tree project died in the end ...

cheers,
  Gerd
On 11/07/11 12:44, Pekka Enberg wrote:
> On Mon, Nov 7, 2011 at 1:02 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> Indeed I do not see any advantage, since all the interfaces they use are
>> stable anyway (sysfs, msr.ko).
>>
>> If they had gone into x86info, for example, my distro (F16, not exactly
>> conservative) would have likely picked those tools up already, but it
>> didn't.
>
> Distributing userspace tools in the kernel tree is a relatively new
> concept, so it's not at all surprising distributions don't pick them up
> as quickly. That doesn't mean it's a fundamentally flawed approach,
> though.

tools/ lacks a separation into "kernel hacker's testing+debugging
toolbox" and "userspace tools". It lacks proper buildsystem integration
for the userspace tools; there is no "make tools" and also no "make
tools_install". Silently dropping new stuff into tools/ and expecting
the world to magically notice isn't going to work.

cheers,
  Gerd
On Mon, Nov 7, 2011 at 2:18 PM, Gerd Hoffmann <kraxel@redhat.com> wrote:
> tools/ lacks a separation into "kernel hacker's testing+debugging
> toolbox" and "userspace tools". It lacks proper buildsystem integration
> for the userspace tools; there is no "make tools" and also no "make
> tools_install". Silently dropping new stuff into tools/ and expecting
> the world to magically notice isn't going to work.

No disagreement here.

Pekka
On 11/07/2011 12:30 PM, Sasha Levin wrote:
> On Mon, Nov 7, 2011 at 12:23 PM, Gerd Hoffmann <kraxel@redhat.com> wrote:
>> Hi,
>>
>>> It's not just about code, it's as much about culture and development process.
>>
>> Indeed. The BSDs have both kernel and the base system in a single
>> repository. There are probably good reasons for (and against) it.
>>
>> In Linux we don't have that culture. No tool (except perf) lives in the
>> kernel repo. I fail to see why kvm-tool is that much different from
>> udev, util-linux, iproute, filesystem tools, that it should be included.
>
> tools/power was merged just two versions ago, do you think that
> merging that was a mistake?

Things like tools/power may make sense, most of the code is tied to the
kernel interfaces. tools/kvm is 20k lines and is likely to be 40k+
lines or more before it is generally usable. The proportion of the code
that talks to the kernel is quite small.
On Mon, Nov 07, 2011 at 01:08:50PM +0100, Gerd Hoffmann wrote:
>
> perf *is* an exception today.
>
> It might make sense to change that. But IMHO it only makes sense if
> there is really broad agreement on it and other core stuff moves into
> the kernel too. Then you'll be able to get advantages out of it. For
> example standardizing the process to create an initramfs (using the
> userspace tools shipped with the kernel) instead of having each distro
> create its own way.

I wish distributions had standardized on a single initramfs, sure. But
that doesn't mean that the only way to do this is to merge userspace
code into the kernel source tree. Everybody uses fsck, originally from
the e2fsprogs source tree, and now from util-linux-ng, and that isn't
merged into the kernel sources. And I think it would be actively
*harmful* to merge util-linux-ng into the kernel sources.

For a variety of reasons, you may want to upgrade util-linux-ng, and not
the kernel, or the kernel, and not util-linux-ng. If you package the
two sources together, it becomes unclear what versions of the kernel
will work with which versions of util-linux-ng, and vice versa.

Suppose you need to fix a security bug in some program that lives in
util-linux-ng. If it was bundled inside the kernel, a distribution
would now have to release a kernel source package. Does that mean that
it will have to ship a new set of kernel binaries? Or does the
distribution have to ship multiple binary packages that derive from the
differently versioned source packages?

And the same problems will exist with kvm-tool. What if you need to
release a new version of kvm-tool? Does that mean that you have to
release a new set of kernel binaries? It's a mess, and there's a
reason why we don't have glibc, e2fsprogs, xfsprogs, util-linux-ng,
etc., all packaged into the kernel sources. Because it's a stupid,
idiotic thing to do.

	- Ted
Hi Avi,

On Mon, Nov 7, 2011 at 2:26 PM, Avi Kivity <avi@redhat.com> wrote:
>> tools/power was merged just two versions ago, do you think that
>> merging that was a mistake?
>
> Things like tools/power may make sense, most of the code is tied to the
> kernel interfaces. tools/kvm is 20k lines and is likely to be 40k+
> lines or more before it is generally usable. The proportion of the code
> that talks to the kernel is quite small.

So what do you think about perf then? The amount of code that talks to
the kernel is much smaller than that of the KVM tool.

Pekka
Hi Ted,

On Mon, Nov 7, 2011 at 2:29 PM, Ted Ts'o <tytso@mit.edu> wrote:
> And the same problems will exist with kvm-tool. What if you need to
> release a new version of kvm-tool? Does that mean that you have to
> release a new set of kernel binaries? It's a mess, and there's a
> reason why we don't have glibc, e2fsprogs, xfsprogs, util-linux-ng,
> etc., all packaged into the kernel sources.

If we need to release a new version, patches would go through the
-stable tree just like with any other subsystem.

> Because it's a stupid, idiotic thing to do.

The discussion is turning into whether or not linux/tools makes sense.
I wish you guys would have had it before perf was merged to the tree.

Pekka
On Mon, Nov 07, 2011 at 02:29:45PM +0200, Pekka Enberg wrote:
> So what do you think about perf then? The amount of code that talks to
> the kernel is much smaller than that of the KVM tool.

I think it's a mess, because it's never clear whether perf needs to be
upgraded when I upgrade the kernel, or vice versa. This is why I keep
harping on the interface issues.

Fortunately it seems less likely (since perf doesn't run with
privileges) that security fixes will need to be released for perf, but
if it did, given the typical regression testing requirements that many
distributions have, and given that most distro packaging tools assume
that all binaries from a single source package come from a single
version of that source package, I predict you will hear screams from
the distro release engineers.

And by the way, there are use cases, where the guest OS kernel and root
on the guest OS are not available to the untrusted users, where the
userspace KVM program would be part of the security perimeter, and
where security releases for the KVM part of the tool might very well be
necessary; it would be unfortunate if that forced the release of new
kernel packages each time security fixes are needed for the kvm-tool
userspace.

Might kvm-tool be more secure than qemu? Quite possibly, given that
it's going to do less than qemu. But please note that I've not been
arguing that kvm-tool shouldn't be done; just that it not be included
in the kernel sources. Just as sparse is not bundled into the kernel
sources, for crying out loud!

	- Ted
On 11/07/2011 02:29 PM, Pekka Enberg wrote:
> Hi Avi,
>
> On Mon, Nov 7, 2011 at 2:26 PM, Avi Kivity <avi@redhat.com> wrote:
>>> tools/power was merged just two versions ago, do you think that
>>> merging that was a mistake?
>>
>> Things like tools/power may make sense, most of the code is tied to the
>> kernel interfaces. tools/kvm is 20k lines and is likely to be 40k+
>> lines or more before it is generally usable. The proportion of the code
>> that talks to the kernel is quite small.
>
> So what do you think about perf then? The amount of code that talks to
> the kernel is much smaller than that of the KVM tool.

Maybe it's outgrown the kernel repo too. Certainly something that has
perl and python integration, a TUI, and one day hopefully a GUI, doesn't
really need the kernel sources.
On Mon, Nov 07, 2011 at 02:42:57PM +0200, Pekka Enberg wrote:
> On Mon, Nov 7, 2011 at 2:29 PM, Ted Ts'o <tytso@mit.edu> wrote:
>> Because it's a stupid, idiotic thing to do.
>
> The discussion is turning into whether or not linux/tools makes sense.
> I wish you guys would have had it before perf was merged to the tree.

Perf was IMHO an overreaction caused by the fact that the systemtap and
oprofile people packaged and released the sources in a way that kernel
developers didn't like.

I don't think perf should be used as a precedent that now argues that
any new kernel utility should be moved into the kernel sources. Does
it make sense to move all of mount, fsck, login, etc., into the kernel
sources? There are far more kernel tools outside of the kernel sources
than inside the kernel sources.

	- Ted
On Mon, Nov 7, 2011 at 2:47 PM, Ted Ts'o <tytso@mit.edu> wrote:
> Perf was IMHO an overreaction caused by the fact that the systemtap and
> oprofile people packaged and released the sources in a way that kernel
> developers didn't like.
>
> I don't think perf should be used as a precedent that now argues that
> any new kernel utility should be moved into the kernel sources. Does
> it make sense to move all of mount, fsck, login, etc., into the kernel
> sources? There are far more kernel tools outside of the kernel
> sources than inside the kernel sources.

There are two overlapping questions here:

  (1) Does it make sense to merge the KVM tool to the Linux kernel tree?

  (2) Does it make sense to merge userspace tools to the kernel tree?

I'm not trying to use perf to justify merging the KVM tool. However,
you seem to be arguing that it shouldn't be merged because merging
userspace tools in general doesn't make sense. That's why I brought up
the situation with perf.

Pekka
On Mon, Nov 7, 2011 at 2:47 PM, Ted Ts'o <tytso@mit.edu> wrote:
> I don't think perf should be used as a precedent that now argues that
> any new kernel utility should be moved into the kernel sources. Does
> it make sense to move all of mount, fsck, login, etc., into the kernel
> sources? There are far more kernel tools outside of the kernel
> sources than inside the kernel sources.

You seem to think that the KVM tool was developed in isolation and we
simply copied the code to tools/kvm for the pull request. That's simply
not true. We've done a lot of work to make the code feel like kernel
code, from locking primitive APIs to serial console emulation register
names. We really consider the KVM tool to be a new Linux subsystem.
It's the long lost cousin or bastard child of KVM, depending on who you
ask.

I don't know if it makes sense to merge the tools you've mentioned
above. My gut feeling is that it's probably not reasonable - there's
already a community working on each of them with their own development
process and coding style. I don't think there's a simple answer to
this, but I don't agree with your rather extreme position that all
userspace tools should be kept out of the kernel tree.

Pekka
On 11/07/2011 05:57 AM, Ingo Molnar wrote:
>
> * Pekka Enberg <penberg@cs.helsinki.fi> wrote:
>
>> On Mon, 7 Nov 2011, Gerd Hoffmann wrote:
>>>> It's not just about code, it's as much about culture and development
>>>> process.
>>>
>>> Indeed. The BSDs have both kernel and the base system in a single
>>> repository. There are probably good reasons for (and against) it.
>>>
>>> In Linux we don't have that culture. No tool (except perf) lives
>>> in the kernel repo. I fail to see why kvm-tool is that much
>>> different from udev, util-linux, iproute, filesystem tools, that
>>> it should be included.
>>
>> You seem to think perf is an exception - I think it's going to be
>> the future norm for userspace components that are very close to the
>> kernel. That's in fact what Ingo was arguing for when he suggested
>> QEMU to be merged to the kernel tree.
>
> Yep, and the answer i got from the Qemu folks when i suggested that
> merge was a polite "buzz off", along the lines of: "We don't want to
> do that, but feel free to write your own tool, leave Qemu alone."

At least it was polite :-)

> Now that people have done exactly that some Qemu folks not only have
> changed their objection from "write your own tool" to "erm, write
> your own tool but do it the way *we* prefer you to do it" - they also
> started contributing *against* the KVM tool with predictable, once
> every 3 months objections against its upstream merge...
>
> That's not very nice and not very constructive.

I think it's fair to have an objection to an upstream merge, but these
threads are not terribly constructive right now as they are just
rehashing the same arguments.

I've been thinking about the idea of merging more userspace tools into
the kernel. I understand the basic reasoning. The kernel has a strong,
established development process. It has good infrastructure and a
robust hierarchy of maintainers. Good infrastructure can make a big
difference to the success of a project.

Expanding the kernel infrastructure to more projects does seem like an
obvious thing to do when you think about it in that way.

The approach other projects have taken to this is to form a formal
incubator. Apache is a good example of this. There are clear (written)
rules about what it takes for a project to join. Once a project joins,
there's a clear governance structure. The project gets to consume all
of the Apache infrastructure resources. Other foundations have a
release cadence to ensure that multiple components form a cohesive
individual release (oVirt).

I think you are trying to do this in a more organic way by just merging
things into the main git tree. Have you thought about creating a more
formal kernel incubator program?

Regards,

Anthony Liguori
On Mon, 7 Nov 2011, Pekka Enberg wrote:
> I've never heard ABI incompatibility used as an argument for perf. Ingo?
Never overtly. They're too clever for that.
In any case, as a primary developer of a library (PAPI) that uses the
perf_events ABI I have to say that having perf in the kernel has been a
*major* pain for us.
Unlike the perf developers, we *do* have to maintain backwards
compatibility. And we have a lot of nasty code in PAPI to handle this.
Entirely because the perf_events ABI is not stable. It's mostly stable,
but there are enough regressions to be a pain.
It's bad enough that there's no way to know which version of the
perf_event ABI you are running against, and we have to guess based on
kernel version. This gets "fun" because all of the vendors have
backported seemingly random chunks of perf_event code to their older
kernels.
And it often does seem as if the perf developers don't care when
something breaks in perf_events if it doesn't affect perf users.
For example, the new NMI watchdog severely breaks perf_event event
allocation if you are using FORMAT_GROUP. perf doesn't use this though,
so none of the kernel developers seem to care. And unless I can quickly
come up with a patch as an outsider, a few kernel versions will go by and
the kernel devs will declare "well it was broken so long, now we don't
have to fix it". Fun.
Vince
* Vince Weaver <vince@deater.net> wrote:

> On Mon, 7 Nov 2011, Pekka Enberg wrote:
>
>> I've never heard ABI incompatibility used as an argument for
>> perf. Ingo?

Correct, the ABI has been designed in a way that makes it really hard
to break via either directed backports or other mess-ups.

The ABI is both backwards *and* forwards compatible, which is very rare
amongst Linux ABIs. For frequently used tools, such as perf, there's no
ABI compatibility problem in practice: using newer perf on older
kernels is pretty common. Using older perf on new kernels is rarer, but
that generally works too.

In hindsight, being in the kernel repo made it *easier* for perf to
implement a good, stable ABI while also keeping a very high rate of
change of the subsystem: changes are more 'concentrated' and people can
stay focused on the ball to extend the ABI in sensible ways instead of
struggling with project boundary artifacts.

I think we needed to do only one revert along the way in the past two
years, to fix an unintended ABI breakage in PowerTop. Considering the
total complexity of the perf ABI our compatibility track record is
*very* good.

> Never overtly. They're too clever for that.

Pekka, Vince has meanwhile become the resident perf critic on lkml,
always in it when it comes to some perf-bashing:

> In any case, as a primary developer of a library (PAPI) that uses
> the perf_events ABI I have to say that having perf in the kernel
> has been a *major* pain for us.

... and you have argued against perf from the very first day on, when
you were one of the perfmon developers - and IMO in hindsight you've
been repeatedly wrong about most of your design arguments.

> Unlike the perf developers, we *do* have to maintain backwards
> compatibility. [...]

We do too, i use new perf on older distro kernels all the time. If you
see a breakage of functionality that tools use, then please report it
in a timely fashion.

> [...] And we have a lot of nasty code in PAPI to handle this.
> Entirely because the perf_events ABI is not stable. It's mostly
> stable, but there are enough regressions to be a pain.

You are really blaming the wrong guys. The PAPI project has the
(fundamental) problem that you are still doing it in the old-style sw
design fashion, with months-long delays in testing, and then you are
blaming the problems you inevitably meet with that model on *us*.

There was one PAPI incident i remember where it took you several
*months* to report a regression in a regular PAPI test-case (no actual
app affected as far as i know). No other tester ever ran the PAPI
testcases so nobody else reported it.

Moving perf out of the kernel would make that particular situation
*worse*, by further increasing the latency of fixes and by further
increasing the risk of breakages.

Sorry, but you are trying to "fix" perf by dragging it down to your bad
level of design and we will understandably resist that ...

> It's bad enough that there's no way to know which version of the
> perf_event ABI you are running against, and we have to guess based
> on kernel version. This gets "fun" because all of the vendors have
> backported seemingly random chunks of perf_event code to their
> older kernels.

The ABI design allows for that kind of flexible extensibility, and
it's one of its major advantages.

What we *cannot* protect against is you relying on obscure details of
the ABI without adding them to 'perf test' and then not testing the
upstream kernel in a timely enough fashion either ...

Nobody but you tests PAPI, so you need to become *part* of the upstream
development process, which releases a new upstream kernel every 3
months.

> And it often does seem as if the perf developers don't care when
> something breaks in perf_events if it doesn't affect perf users.

I have to reject your slander: Peter, Arnaldo and me all care deeply
about fixing regressions, and i've personally applied fixes out of
order that addressed some sort of PAPI problem - whenever you chose to
report them.

Vince, you are wrong and you have also become somewhat malicious in
your arguments - please stop it.

> For example, the new NMI watchdog severely breaks perf_event event
> allocation if you are using FORMAT_GROUP. perf doesn't use this
> though, so none of the kernel developers seem to care. And unless
> I can quickly come up with a patch as an outsider, a few kernel
> versions will go by and the kernel devs will declare "well it was
> broken so long, now we don't have to fix it". Fun.

Face it, the *real* problem is that beyond yourself very few people
who use a new kernel use PAPI, and your long latency of testing exposes
you to breakages in a much more agile subsystem such as perf. Please
fix that instead of blaming it on others.

Also, as i mentioned several times before, you are free to add an
arbitrary number of ABI test-cases to 'perf test' and we can promise
that we run them. Right now it consists of a few tests:

  $ perf test
   1: vmlinux symtab matches kallsyms: Ok
   2: detect open syscall event: Ok
   3: detect open syscall event on all cpus: Ok
   4: read samples using the mmap interface: Ok

... but we do not object to adding testcases for functionality used by
PAPI.

The usual ABI rules also apply: we'll revert everything that breaks the
ABI - but for that you need to report it *in time*, not timed one day
before the next -stable release like you did last time around ...

So there are several ways you could help push your own interests into
the kernel project.

Thanks,

	Ingo
On Mon, 7 Nov 2011, Pekka Enberg wrote:
>> I've never heard ABI incompatibility used as an argument for perf. Ingo?

On Mon, Nov 7, 2011 at 7:03 PM, Vince Weaver <vince@deater.net> wrote:
> Never overtly. They're too clever for that.

If you want me to take you seriously, spare me the conspiracy theories, OK?

I'm sure perf developers break the ABI sometimes - that happens elsewhere in the kernel as well. However, Ted claimed that perf developers use tools/perf as an excuse to break the ABI _on purpose_, which is something I have a hard time believing.

Your snarky remarks don't really help this discussion either. It's apparent from the LKML discussions that you're more interested in arguing with the perf developers than in helping them.

			Pekka
Ingo Molnar <mingo@elte.hu> writes:

> [...]
>> It's problem enough that there's no way to know what version of the
>> perf_event abi you are running against and we have to guess based
>> on kernel version. This gets "fun" because all of the vendors have
>> backported seemingly random chunks of perf_event code to their
>> older kernels.
>
> The ABI design allows for that kind of flexible extensibility, and
> it's one of its major advantages.
>
> What we *cannot* protect against is you relying on obscure details of
> the ABI [...]

Is there some documentation that clearly spells out which parts of the perf syscall userspace ABI are "obscure" and thus presumably changeable?

> [...] The usual ABI rules also apply: we'll revert everything that
> breaks the ABI - but for that you need to report it *in time* [...]

If the ABI is so great in its flexible extensibility, how come it can't be flexibly extended without having to pass the burden of compatibility testing & reversion-yawping to someone else?

- FChE
On Mon, 7 Nov 2011, Frank Ch. Eigler wrote:
>> The ABI design allows for that kind of flexible extensibility, and
>> it's one of its major advantages.
>>
>> What we *cannot* protect against is you relying on obscure details of
>> the ABI [...]
>
> Is there some documentation that clearly spells out which parts of the
> perf syscall userspace ABI are "obscure" and thus presumably
> changeable?

That's actually something the KVM and virtio folks have done a great job with, IMHO. Both ABIs are documented pretty extensively and the specs are kept up to date.

I guess for the perf ABI, "perf test" is the closest thing to a specification, so if your application is using something that's not covered by it, you might be in trouble.

			Pekka
On Mon, Nov 07, 2011 at 09:53:28PM +0200, Pekka Enberg wrote:
>
> I'm sure perf developers break the ABI sometimes - that happens
> elsewhere in the kernel as well. However, Ted claimed that perf
> developers use tools/perf as an excuse to break the ABI _on purpose_,
> which is something I have a hard time believing.

I remember an assertion, probably a year or two ago, probably at the previous year's kernel summit, that one of the reasons for having the perf code inline in the kernel was so that synchronized changes could be made to both the kernel and the userspace tool together. So it's not a matter of breaking the ABI _on_ _purpose_; it's an assertion that there is no ABI at all. Since the perf tool and the kernel have to be built together, so long as a user does that, no harm, no foul.

Recall that Linus has said that he doesn't care about whether or not something is an ABI; he only cares whether user code perceives breakage. If users didn't perceive breakage, then it doesn't matter if an interface is changed.

So the real question is not whether or not this was an excuse to break the ABI, but whether or not the perf developers acknowledge there is an ABI at all, and whether it's OK for other developers to depend on the syscall interface or not.

Actually, though, it shouldn't matter, because intentions don't matter. Recall the powertop/ftrace case. If you expose an interface, and people start using that interface, then you can't break them, period. So as far as Vince is concerned, if you have a userspace library which depends on the perf interface, then you should try out the kernel after each merge window, and if your library breaks, you should complain to Ingo and Linus directly, and request that the commit which broke your tool be reverted --- because that's the rule; no breakage is allowed.

As far as kvm-tool being in the kernel, I still don't see particularly valid arguments for why it should be in the kernel. It can't be the perf argument of "we can make simultaneous changes in the userspace and kernel code", because if those changes break qemu-kvm, then a complaint to Linus will cause the problem code to be reverted.

As far as the code using the same coding conventions and naming conventions as the kernel, that to me isn't a particularly strong argument either. E2fsprogs uses Signed-off-by lines and the same coding conventions as the kernel, and it even has slightly modified versions of two kernel source files (e2fsck/recovery.c and e2fsck/revoke.c), plus a header file with data structures that have to be kept in sync with the kernel header file. But that doesn't make it "part of the kernel", and it's not a justification for it to be bundled with the kernel.

Personally, I consider code that runs in userspace as a pretty bright line marking it as "not kernel code", and while perhaps things like initramfs and the crazy ideas people have had in the past of moving stuff out of kernel/init.c into userspace might have qualified as stuff really close to the kernel, something like kvm-tool, which runs way after boot, doesn't even come close. Wine is another example of a package that has lots of close kernel ties but was also not bundled into the kernel.

The precedent has mainly been on the "keep the kernel separate" side of things, and the arguments for bundling kvm-tool with the kernel are much weaker, especially since the interface is well-developed and there are external users of the interface, which means you can't make changes to the interface willy-nilly. Indeed, when the perf interface was changing all the time, maybe there was some convenience in having it bundled with the kernel, so there was no need to negotiate interface version numbers, et al. But given how it has to link in so many userspace libraries, I personally think it's fair to ask, now that it has matured, whether it's time to move it out of the kernel source tree.

Regards,

					- Ted
On Mon, Nov 07, 2011 at 10:09:34PM +0200, Pekka Enberg wrote:
>
> I guess for perf ABI, "perf test" is the closest thing to a
> specification so if your application is using something that's not
> covered by it, you might be in trouble.

I don't believe there's ever been any guarantee that "perf test" from version N of the kernel will always work on a version N+M of the kernel. Perhaps I am wrong, though. If that is a guarantee that the perf developers are willing to stand behind, or have already made, I would love to be corrected and would be delighted to hear that in fact there is a stable, backwards compatible perf ABI.

Regards,

					- Ted
Hi Ted,

On Mon, Nov 7, 2011 at 10:32 PM, Ted Ts'o <tytso@mit.edu> wrote:
> Personally, I consider code that runs in userspace as a pretty bright
> line, as being "not kernel code", and while perhaps things like
> initramfs and the crazy ideas people have had in the past of moving
> stuff out of kernel/init.c into userspace might have qualified as
> stuff really close to the kernel, something like kvm-tool that runs
> way after boot, doesn't even come close. Wine is another example of
> another package that has lots of close kernel ties, but was also not
> bundled into the kernel.

It's not as clear a line as you make it out to be.

KVM tool also has mini-BIOS code that runs in guest space. It has code that runs in userspace but is effectively a simple bootloader. So it definitely doesn't fit the simple definition of "running way after boot" (we're _booting_ the kernel too).

Linsched fits your definition but is clearly worth integrating into the kernel tree. While you are suggesting that maybe we should move Perf out of the tree now that it's mature, I'm pretty sure you'd agree that it probably would not have happened if the userspace parts had been developed out of tree.

There have also been spectacular failures in kernel history where the userspace split was enforced. For example, userspace suspend didn't turn out the way people envisioned it at the time. We don't know how it would have worked out if the userspace components had been in the tree, but it certainly would have solved many of the early ABI issues.

I guess I'm trying to argue here that there's a middle ground. I'm willing to bet projects like klibc and unified initramfs will eventually make it to the kernel tree because they simply make so much sense. I'm also willing to bet that the costs of moving Perf out of the tree are simply too high to make it worthwhile.

Does that mean KVM tool should get a free pass in merging? Absolutely not. But I do think your position is too extreme and ignores the benefits of developing userspace tools in the kernel ecosystem, which was summed up by Anthony rather well in this thread:

https://lkml.org/lkml/2011/11/7/169

			Pekka
On 11/07/2011 03:36 PM, Pekka Enberg wrote:
> Hi Ted,
>
> On Mon, Nov 7, 2011 at 10:32 PM, Ted Ts'o <tytso@mit.edu> wrote:
>> Personally, I consider code that runs in userspace as a pretty bright
>> line, as being "not kernel code", and while perhaps things like
>> initramfs and the crazy ideas people have had in the past of moving
>> stuff out of kernel/init.c into userspace might have qualified as
>> stuff really close to the kernel, something like kvm-tool that runs
>> way after boot, doesn't even come close. Wine is another example of
>> another package that has lots of close kernel ties, but was also not
>> bundled into the kernel.
>
> It's not as clear a line as you make it out to be.
>
> KVM tool also has mini-BIOS code that runs in guest space. It has
> code that runs in userspace but is effectively a simple bootloader.
> So it definitely doesn't fit the simple definition of "running way
> after boot" (we're _booting_ the kernel too).
>
> Linsched fits your definition but is clearly worth integrating into
> the kernel tree. While you are suggesting that maybe we should move
> Perf out of the tree now that it's mature, I'm pretty sure you'd
> agree that it probably would not have happened if the userspace parts
> had been developed out of tree.
>
> There have also been spectacular failures in kernel history where the
> userspace split was enforced. For example, userspace suspend didn't
> turn out the way people envisioned it at the time. We don't know how
> it would have worked out if the userspace components had been in the
> tree, but it certainly would have solved many of the early ABI
> issues.
>
> I guess I'm trying to argue here that there's a middle ground. I'm
> willing to bet projects like klibc and unified initramfs will
> eventually make it to the kernel tree because they simply make so
> much sense. I'm also willing to bet that the costs of moving Perf out
> of the tree are simply too high to make it worthwhile.
>
> Does that mean KVM tool should get a free pass in merging? Absolutely
> not. But I do think your position is too extreme and ignores the
> benefits of developing userspace tools in the kernel ecosystem, which
> was summed up by Anthony rather well in this thread:
>
> https://lkml.org/lkml/2011/11/7/169

The kernel ecosystem does not have to be limited to linux.git. There could be a process for becoming a "kernel.org project" for projects that fit a certain set of criteria. These projects could all share the Linux kernel release cadence and have a kernel maintainer as a sponsor, or something like that.

That is something that could potentially benefit things like e2fs-tools and all of the other tools that are tied closely to the kernel. In fact, having a single place where users could find all of the various kernel-related tools and helpers would probably be extremely useful. There's no reason this needs to be linux.git, though; it could just be a web page on kernel.org.

Regards,

Anthony Liguori

> Pekka
On Nov 7, 2011, at 5:19 PM, Anthony Liguori wrote:
>
> The kernel ecosystem does not have to be limited to linux.git. There
> could be a process to be a "kernel.org project" for projects that fit
> a certain set of criteria. These projects could all share the Linux
> kernel release cadence and have a kernel maintainer as a sponsor or
> something like that.
>
> That is something that could potentially benefit things like
> e2fs-tools and all of the other tools that are tied closely to the
> kernel.

We have that already. Packages such as e2fsprogs, xfsprogs, xfstests, sparse, git, etc., have git trees under git.kernel.org. And I agree that's the perfect place for kvm-tool and perf. :-)

-- Ted
On Mon, 7 Nov 2011, Ingo Molnar wrote:
> I think we needed to do only one revert along the way in the past two
> years, to fix an unintended ABI breakage in PowerTop. Considering the
> total complexity of the perf ABI our compatibility track record is
> *very* good.

There have been more breakages, as you know. It's just that they weren't caught in time, so they were declared to be grandfathered in rather than fixed.

> Pekka, Vince has meanwhile become the resident perf critic on lkml,
> always in it when it comes to some perf-bashing:

For what it's worth, you'll find commits from me in the qemu tree, and I also oppose the merge of kvm-tool into the Linux tree.

> ... and you have argued against perf from the very first day on, when
> you were one of the perfmon developers - and IMO in hindsight you've
> been repeatedly wrong about most of your design arguments.

I can't find an exact e-mail, but I seem to recall my arguments were that Pentium 4 support would be hard (it was), that in-kernel generalized events were a bad idea (I still think that; try talking to the ARM guys sometime about it), and that making access to raw events hard (by not using a naming library) was silly. I'm sure I probably said other things that were eventually addressed.

> The PAPI project has the (fundamental) problem that you are still
> doing it in the old-style sw design fashion, with many months long
> delays in testing, and then you are blaming the problems you
> inevitably meet with that model on *us*.

The fundamental problem with the PAPI project is that we only have 3 full-time developers, and we have to make sure PAPI runs on about 10 different platforms, of which perf_events/Linux is only one. Time I waste tracking down perf_event ABI regressions and DoS bugs takes away from actual useful userspace PAPI development.

> There was one PAPI incident i remember where it took you several
> *months* to report a regression in a regular PAPI test-case (no
> actual app affected as far as i know). No other tester ever ran the
> PAPI testcases so nobody else reported it.

We have a huge userbase. They run on some pretty amazing machines and do some tests that strain perf libraries to the limit. They also tend to use distro kernels, assuming they have even moved to 2.6.31+ kernels yet. When these power users report problems, they aren't going to be against the -tip tree.

> Nobody but you tests PAPI so you need to become *part* of the
> upstream development process, which releases a new upstream kernel
> every 3 months.

PAPI is a free software project, with the devel tree available from CVS. It takes maybe 15 minutes to run the full PAPI regression suite. I encourage you or any perf developer to try it and report any issues.

I can only be so comprehensive. I didn't find the current NMI-watchdog regression right away because my git tree builds didn't have it enabled. It wasn't until there started being 3.0 distro kernels that people started reporting the problem to us.

> Also, as i mentioned it several times before, you are free to add an
> arbitrary number of ABI test-cases to 'perf test' and we can promise
> that we run that. Right now it consists of a few tests:

As mentioned before, I have my own perf_event test suite with 20+ tests:

   http://web.eecs.utk.edu/~vweaver1/projects/perf-events/validation.html

I do run it often. It tends to be reactionary though, as I can only add a test for a bug once I know about it.

I also have more up-to-date perf documentation than the kernel does:

   http://web.eecs.utk.edu/~vweaver1/projects/perf-events/programming.html

and a cpu compatibility matrix:

   http://web.eecs.utk.edu/~vweaver1/projects/perf-events/support.html

I didn't really want to turn this into yet another perf flamewar. I just didn't want the implication that perf being in-kernel is all rainbows and unicorns to go unchallenged.

Vince
* Theodore Tso <tytso@MIT.EDU> wrote:

> On Nov 7, 2011, at 5:19 PM, Anthony Liguori wrote:
>
> > The kernel ecosystem does not have to be limited to linux.git.
> > There could be a process to be a "kernel.org project" for
> > projects that fit a certain set of criteria. These projects
> > could all share the Linux kernel release cadence and have a
> > kernel maintainer as a sponsor or something like that.
> >
> > That is something that could potentially benefit things like
> > e2fs-tools and all of the other tools that are tied closely to
> > the kernel.
>
> We have that already. Packages such as e2fsprogs, xfsprogs,
> xfstests, sparse, git, etc., have git trees under git.kernel.org.
> And I agree that's the perfect place for kvm-tool and perf. :-)

I guess this should be a F.A.Q., but it's worth repeating that from the perf tooling project's perspective, being integrated into the kernel tree in the past 2-3 years had *numerous* *massive* advantages that improved the project's quality. The shared repo brought countless advantages that simple kernel.org hosting in a split external tool repo would not have brought.

No ifs and whens about it, these are the plain facts:

 - Better features, better ABIs: perf maintainers can enforce clean,
   functional and usable tooling support *before* committing to an
   ABI on the kernel side. This is a *huge* deal for improving the
   quality of the kernel, the ABI and the tooling side, and we made
   use of it a number of times. A perf kernel feature has to come
   with working, high-quality and usable tooling support - or it
   won't go upstream. (I could think of numerous other subsystems
   which would see improvements if they enforced this too.)

 - We have a shared Git tree with unified, visible version control.
   I can see kernel feature commits followed by tooling support, in
   a single flow of related commits:

      perf probe: Update perf-probe document
      perf probe: Support --del option
      trace-kprobe: Support delete probe syntax

   With two separate Git repositories this kind of connection between
   the tool and the kernel is inevitably weakened or lost.

 - Easier development, easier testing: if you work on a kernel
   feature and on matching tooling support then it's *much* easier
   to work in a single tree than in two or more trees in parallel. I
   have worked on multi-tree features before, and barring special
   exceptions they are generally a big pain to develop. It's not
   just a developer convenience factor: "big pain" inevitably
   transforms into "lower quality" as well.

 - There's a predictable 3 month release cycle of the perf tool,
   enforced *externally* by the kernel project. This allowed much
   easier synchronization of kernel and user-space features and
   removes version friction. It also guarantees a predictable
   release frequency to packagers and users.

 - We are using and enforcing established quality control and coding
   principles of the kernel project. If we mess up then Linus pushes
   back on us at the last line of defense - and has pushed back on
   us in the past. I think many of the currently external kernel
   utilities could benefit from the resulting rise in quality. I've
   seen separate tool projects degrade into barely usable tinkerware
   - that i think cannot happen to perf, regardless of who maintains
   it in the future.

 - Better debuggability: sometimes a perf change in combination with
   a kernel change causes a breakage. I have bisected the shared
   tree a couple of times already, instead of having to bisect a
   (100,000 commits x 10,000 commits) combined space, which is much
   harder to debug ...

 - Code reuse: we can and do share source code between the kernel
   and the tool where it makes sense. Both the tooling and the
   kernel side code improve from this. (Often explicit librarization
   makes little sense due to the additional maintenance overhead of
   a split library project and the impossibly long latency before
   the kernel could rely on the ready existence of such a newly
   created library project.)

 - [ etc.: there's half a dozen other, smaller positive effects as
     well. ]

Also, while i'm generally pretty good at being the devil's advocate as well, i've yet to see a *single* serious disadvantage of the shared repo:

 - Yes, in principle sharing code could be messy - in practice it is
   not; in fact it cleans things up where we share code and triggers
   fixes on both sides. Sharing code *works*, as long as there's no
   artificial project boundary.

 - Yes, in principle we could end up only testing new-kernel+new-tool
   and regress older ABI or tool versions. In practice it does not
   happen disproportionately: people (us developers included) do test
   the other combinations as well, and the ABI has been designed in a
   way that makes it backwards and forwards compatible by default. I
   think we have messed up a surprisingly small number of times so
   far, considering the complexity and growth rate of the ABI.

 - Yes, in principle we could end up being too kernel centric. In
   practice people are using perf to measure user-space code far more
   often - and we ourselves use perf to develop perf tooling, which
   gives an indirect guarantee as well.

In our experience, the almost 3 years track record of perf gives strong validation to the idea that tools that are closely related to the kernel can (and quite likely *should*) prosper in the kernel repo itself. While it was somewhat of an unknowable experiment when we started it 3 years ago, in hindsight it was a no-brainer decision with *many* documented advantages both to the kernel and to tools/perf/.

So we definitely see correlation between tool quality and the shared-repo maintenance set-up, and i think the list above gives plenty of reason to suspect causation as well ...

Finally, i find it rather weird that the people pushing for perf to move out of the kernel have not actually *worked* in such a shared repo scheme yet... None of the perf developers i'm working with have complained about the shared repo so far - publicly or privately. By all means they are enjoying it, and if you look at the stats and results you'll agree that they are highly productive working in that environment. If you look at tools/kvm/ contributors you'll find a very similar mind-set and similar experiences - albeit the project is much younger and smaller. *That is what matters*.

So i think you should seriously consider moving your projects *into* tools/ instead of trying to get other projects to move out ... You should at least *try* the unified model before criticising it - because currently you guys are preaching about sex while having sworn lifelong celibacy ;-)

Thanks,

	Ingo
On Nov 8, 2011, at 4:32 AM, Ingo Molnar wrote:
>
> No ifs and when about it, these are the plain facts:
>
> - Better features, better ABIs: perf maintainers can enforce clean,
>   functional and usable tooling support *before* committing to an
>   ABI on the kernel side.

"We don't have to be careful about breaking interface compatibility while we are developing new features."

The flip side of this is that it's not obvious when an interface is stable and when it is still subject to change. It makes life much harder for any userspace code that doesn't live in the kernel. And I think we do agree that moving all of userspace into a single git tree makes no sense, right?

> - We have a shared Git tree with unified, visible version control. I
>   can see kernel feature commits followed by tooling support, in a
>   single flow of related commits:
>
>      perf probe: Update perf-probe document
>      perf probe: Support --del option
>      trace-kprobe: Support delete probe syntax
>
>   With two separate Git repositories this kind of connection between
>   the tool and the kernel is inevitably weakened or lost.

"We don't have to clearly document new interfaces between kernel and userspace, and can instead rely on git commit order for people to figure out what's going on with some new interface."

> - Easier development, easier testing: if you work on a kernel
>   feature and on matching tooling support then it's *much* easier to
>   work in a single tree than working in two or more trees in
>   parallel. I have worked on multi-tree features before, and except
>   special exceptions they are generally a big pain to develop.

I've developed in split-tree systems, and it's really not that hard. It does mean you have to be explicit about designing interfaces up front, and you have to have a good, robust way of negotiating what features are in the kernel and what features are supported by the userspace --- but if you don't do that, then good backwards and forwards compatibility between different versions of the tool simply doesn't exist.

So at the end of the day the question is whether you want to be able to (for example) update e2fsck to get a better ability to fix more file system corruptions, without needing to upgrade the kernel. If you want to be able to use a newer, better e2fsck with an older, enterprise kernel, then you have to use certain programming disciplines. That's where the work is, not in whether you have to maintain two git trees or a single git tree.

> - We are using and enforcing established quality control and coding
>   principles of the kernel project. If we mess up then Linus pushes
>   back on us at the last line of defense - and has pushed back on us
>   in the past. I think many of the currently external kernel
>   utilities could benefit from the resulting rise in quality.
>   I've seen separate tool projects degrade into barely usable
>   tinkerware - that i think cannot happen to perf, regardless of who
>   maintains it in the future.

That's basically saying that if you don't have someone competent managing the git tree and providing quality assurance, life gets hard. Sure. But at the same time, does it scale to move all of userspace under one git tree and depend on Linus to push back? I mean, it would have been nice to move all of GNOME 3 under the Linux kernel, so Linus could have pushed back on behalf of all of us power users, but as much as many of us would have appreciated someone being able to push back against the insanity which is the GNOME design process, is that really a good enough excuse to move all of GNOME 3 into the kernel source tree? :-)

> - Better debuggability: sometimes a combination of a perf
>   change in combination with a kernel change causes a breakage. I
>   have bisected the shared tree a couple of times already, instead
>   of having to bisect a (100,000 commits x 10,000 commits) combined
>   space which much harder to debug ...

What you are describing happens when someone hasn't been careful about their kernel/userspace interfaces. If you have been rigorous with your interfaces, this isn't really an issue. When's the last time we've had to do NxM exhaustive testing to find a broken syscall ABI between (for example) the kernel and MySQL?

> - Code reuse: we can and do share source code between the kernel and
>   the tool where it makes sense. Both the tooling and the kernel
>   side code improves from this. (Often explicit librarization makes
>   little sense due to the additional maintenance overhead of a split
>   library project and the impossibly long latency of how the kernel
>   can rely on the ready existence of such a newly created library
>   project.)

How much significant code really can get shared? Memory allocation is different between kernel and userspace code, how you do I/O is different, error reporting conventions are generally different, etc. You might have some serialization and deserialization code in common, but (surprise!) that's generally part of your interface, which is hopefully relatively stable, especially once the tool and the interface have matured.

-- Ted
* Ted Ts'o <tytso@mit.edu> wrote: > I don't believe there's ever been any guarantee that "perf test" > from version N of the kernel will always work on a version N+M of > the kernel. Perhaps I am wrong, though. If that is a guarantee > that the perf developers are willing to stand behind, or have > already made, I would love to be corrected and would be delighted > to hear that in fact there is a stable, backwards compatible perf > ABI. We do even more than that, the perf ABI is fully backwards *and* forwards compatible: you can run older perf on newer ABIs and newer perf on older ABIs. To show you how it works in practice, here's a random cross-compatibility experiment: going back to the perf ABI of 2 years ago. I used v2.6.32 which was just the second upstream kernel with perf released in it. So i took a fresh perf tool version and booted a vanilla v2.6.32 (x86, defconfig, PERF_COUNTERS=y) kernel: $ uname -a Linux mercury 2.6.32 #162137 SMP Tue Nov 8 10:55:37 CET 2011 x86_64 x86_64 x86_64 GNU/Linux $ perf --version perf version 3.1.1927.gceec2 $ perf top Events: 2K cycles 61.68% [kernel] [k] sha_transform 16.09% [kernel] [k] mix_pool_bytes_extract 4.70% [kernel] [k] extract_buf 4.17% [kernel] [k] _spin_lock_irqsave 1.44% [kernel] [k] copy_user_generic_string 0.75% [kernel] [k] extract_entropy_user 0.37% [kernel] [k] acpi_pm_read [the box is running a /dev/urandom stress-test as you can see.] 
$ perf stat sleep 1 Performance counter stats for 'sleep 1': 0.766698 task-clock # 0.001 CPUs utilized 1 context-switches # 0.001 M/sec 0 CPU-migrations # 0.000 M/sec 177 page-faults # 0.231 M/sec 1,513,332 cycles # 1.974 GHz <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend 522,609 instructions # 0.35 insns per cycle 65,812 branches # 85.838 M/sec 7,762 branch-misses # 11.79% of all branches 1.076211168 seconds time elapsed The two <not supported> events are not supported by the old kernel - but the other events were and the tool picked them up without bailing out. Regular profiling: $ perf record -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.075 MB perf.data (~3279 samples) ] perf report output: $ perf report Events: 1K cycles 64.45% dd [kernel.kallsyms] [k] sha_transform 19.39% dd [kernel.kallsyms] [k] mix_pool_bytes_extract 4.11% dd [kernel.kallsyms] [k] _spin_lock_irqsave 2.98% dd [kernel.kallsyms] [k] extract_buf 0.84% dd [kernel.kallsyms] [k] copy_user_generic_string 0.38% ssh libcrypto.so.0.9.8b [.] lh_insert 0.28% flush-8:0 [kernel.kallsyms] [k] block_write_full_page_endio 0.28% flush-8:0 [kernel.kallsyms] [k] generic_make_request These examples show *PICTURE PERFECT* backwards ABI compatibility, when using the bleeding perf tool on an ancient perf kernel (when it wasnt even called 'perf events' but 'perf counters'). [ Note, i didnt go back to v2.6.31, the oldest upstream perf kernel, because it's such a pain to build with recent binutils and recent GCC ... v2.6.32 already needed a workaround and a couple of .config tweaks to build and boot at all. 
] Then i built the ancient v2.6.32 perf tool from 2 years ago: $ perf --version perf version 0.0.2.PERF and booted a fresh v3.1+ kernel: $ uname -a Linux mercury 3.1.0-tip+ #162138 SMP Tue Nov 8 11:14:26 CET 2011 x86_64 x86_64 x86_64 GNU/Linux $ perf stat ls Performance counter stats for 'ls': 1.739193 task-clock-msecs # 0.069 CPUs 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 250 page-faults # 0.144 M/sec 3477562 cycles # 1999.526 M/sec 1661460 instructions # 0.478 IPC 839826 cache-references # 482.883 M/sec 15742 cache-misses # 9.051 M/sec 0.025231139 seconds time elapsed $ perf top ------------------------------------------------------------------------------ PerfTop: 38916 irqs/sec kernel:99.6% [100000 cycles], (all, 2 CPUs) ------------------------------------------------------------------------------ samples pcnt kernel function _______ _____ _______________ 41191.00 - 53.1% : sha_transform 20818.00 - 26.8% : mix_pool_bytes_extract 5481.00 - 7.1% : _raw_spin_lock_irqsave 2132.00 - 2.7% : extract_buf 1788.00 - 2.3% : copy_user_generic_string 801.00 - 1.0% : acpi_pm_read 446.00 - 0.6% : _raw_spin_unlock_irqrestore 284.00 - 0.4% : __memset 259.00 - 0.3% : extract_entropy_user $ perf record -a -f sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.034 MB perf.data (~1467 samples) ] $ perf report # Samples: 1023 # # Overhead Command Shared Object Symbol # ........ ............. ................................ ...... # 4.50% swapper [kernel] [k] acpi_pm_read 4.01% swapper [kernel] [k] delay_tsc 2.05% sudo /lib64/libcrypto.so.0.9.8b [.] 0x000000000a0549 1.96% perf [kernel] [k] vsnprintf 1.86% swapper [kernel] [k] test_clear_page_writeback 1.66% perf [kernel] [k] format_decode 1.56% sudo /lib64/ld-2.7.so [.] do_lookup_x These examples show *PICTURE PERFECT* forwards ABI compatibility, using the ancient perf tool on a bleeding edge kernel. 
During the years we migrated across various transformations of the
subsystem and added tons of features, while maintaining the perf ABI.

I don't know where the whole ABI argument comes from - perf has
arguably one of the best and most compatible tooling ABIs within
Linux. I suspect back in the original perf flamewars people made up
their minds prematurely that it 'cannot' possibly work, and never
changed their minds about it, regardless of reality proving them
wrong ;-)

And yes, the quality of the ABI and tooling cross-compatibility is not
accidental at all: it is fully intentional and we take great care that
it stays so. More than that, we'll gladly take more 'perf test'
testcases for obscure corner-cases that other tools might rely on.
I.e. we are willing to help external tooling get their testcases built
into the kernel repo.

Note that such a level of ABI support is arguably clear overkill for
instrumentation - which by its very nature tends to migrate to the
newer versions - still we maintain it because in our opinion good,
usable tooling should have a good, extensible ABI.

Thanks,

	Ingo
On Tue, 2011-11-08 at 11:22 +0100, Ingo Molnar wrote:
>
> We do even more than that, the perf ABI is fully backwards *and*
> forwards compatible: you can run older perf on newer ABIs and newer
> perf on older ABIs.

The ABI yes, the tool no - the tool very much relies on some newer ABI
parts. Supporting fallbacks isn't always possible/wanted.
On Nov 8, 2011, at 5:22 AM, Ingo Molnar wrote:

> We do even more than that, the perf ABI is fully backwards *and*
> forwards compatible: you can run older perf on newer ABIs and newer
> perf on older ABIs.

It's great to hear that! But in that case, there's an experiment we
can't really run, which is: if perf had been developed in a separate
tree, would it have been just as successful?

My belief is that perf was successful because *you* and the other perf
developers were competent developers who got things right - not
because it was inside the kernel tree. You've argued that things were
much better because it was inside the tree, but that's not actually
something we can put to a scientific, repeatable experiment.

I will observe that some of the things that caused me to become
enraged by SystemTap (such as the fact that I simply couldn't even
build the damned thing in a non-Red Hat compilation environment) would
not have been solved by moving SystemTap into the kernel git tree ---
at least not without moving a large number of its external
dependencies, such as the elf library, et al., into the kernel tree as
well. So there is a whole class of problems seen in previous tooling
systems that were caused not by the fact that they were separate from
the kernel, but by the fact that they weren't being developed by the
kernel developers, so their authors didn't understand how to make the
tools work well for kernel developers.

If we had gone back in time, and had the same set of perf developers
working in an external tree, and SystemTap and/or OProfile had been
developed in the kernel tree, would it really have made that much
difference? Sure, Linus and other kernel developers would have yelled
at the SystemTap and OProfile folks more, but I haven't seen much
evidence that they listened to us when they were outside of the kernel
tree, and it's not obvious they would have listened with the code
being inside the kernel tree.
My claim is that the outcome wouldn't have been all that different,
and that's because the difference was *you*, Ingo Molnar: as a good
engineer, you would have designed a good backwards compatible ABI
whether the code was inside or outside of the kernel, you would have
insisted on good taste and usefulness to kernel programmers whether
perf was in or out of the kernel, and you would have insisted on
kernel coding guidelines and regular release cycles even if perf was
outside of the kernel.

As Linus sometimes likes to say, in many cases it's more about the
_people_.

Regards,

-- Ted
On Tue, 8 Nov 2011, Theodore Tso wrote: > It's great to hear that! But in that case, there's an experiment we > can't really run, which is if perf had been developed in a separate > tree, would it have been just as successful? Experiment, eh? We have the staging tree because it's a widely acknowledged belief that kernel code in the tree tends to improve over time compared to code that's sitting out of the tree. Are you disputing that belief? If you don't dispute that, what makes you think the same effect doesn't apply to code that looks like Linux code and is developed the same way but runs in userspace? Pekka
On Nov 8, 2011, at 6:20 AM, Pekka Enberg wrote:
> We have the staging tree because it's a widely acknowledged belief that kernel code in the tree tends to improve over time compared to code that's sitting out of the tree. Are you disputing that belief?
Kernel code in the kernel source tree improves, because that's where it will eventually end up --- linked against the kernel.
There are all sorts of dynamics in play that don't necessarily apply to userspace code.
Otherwise we could just link in all of the userspace code in a Linux distribution and magically expect it will get better, eh? Not!
-- Ted
On Tue, 8 Nov 2011, Theodore Tso wrote: >> We have the staging tree because it's a widely acknowledged belief that >> kernel code in the tree tends to improve over time compared to code >> that's sitting out of the tree. Are you disputing that belief? > > Kernel code in the kernel source tree improves; because that's where it > will eventually end up --- linked against the kernel. > > There are all sorts of dynamics in play that don't necessarily apply to > userspace code. > > Otherwise we could just link in all of the userspace code in a Linux > distribution and magically expect it will get better, eh? Not! You just yourself said it's about the people. Why do you now think it's about linking against the kernel? I know I have hacked on various parts of the kernel that I have never linked to my kernel. Pekka
Hi - On Tue, Nov 08, 2011 at 11:22:35AM +0100, Ingo Molnar wrote: > [...] These examples show *PICTURE PERFECT* forwards ABI > compatibility, using the ancient perf tool on a bleeding edge > kernel. [...] Almost: they demonstrate that those parts of the ABI that these particular perf commands rely on have been impressively compatible. Do you have any sort of ABI coverage measurement, to see what parts of the ABI these perf commands do not use? - FChE
* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > The ABI yes, the tool no, the tool very much relies on some newer > ABI parts. Supporting fallbacks isn't always possible/wanted. Yeah, sure - and an older tool cannot possibly support newer features either. Thanks, Ingo
On Tue, 8 Nov 2011, Frank Ch. Eigler wrote:
> Almost: they demonstrate that those parts of the ABI that these
> particular perf commands rely on have been impressively compatible.
> Do you have any sort of ABI coverage measurement, to see what
> parts of the ABI these perf commands do not use?

It's pretty obvious that the perf ABI is lacking in that department,
based on Vince's comments, isn't it? There's an easy fix for this too:
improve "perf test" to cover the cases you're interested in. While an
ABI spec would be a nice addition, it's not going to make
compatibility problems magically go away.

			Pekka
* Vince Weaver <vince@deater.net> wrote:

> On Mon, 7 Nov 2011, Ingo Molnar wrote:
> > I think we needed to do only one revert along the way in the past
> > two years, to fix an unintended ABI breakage in PowerTop.
> > Considering the total complexity of the perf ABI our
> > compatibility track record is *very* good.
>
> There have been more breakages, as you know. It's just they
> weren't caught in time so they were declared to be grandfathered in
> rather than fixed.

I remember one such instance where you reported a 'regression' that
spanned several -stable kernel releases - and unless the fix is easy
and obvious, that's the regular upstream treatment. As Linus said at
the recent Kernel Summit too, an ABI is only an ABI if it's actually
*used*.

But there's more: you've repeatedly rejected our offer to extend
'perf test' to cover the functionality that your library relies on.
If you refuse to test newer upstream kernels in a timely fashion while
you rely on obscure details that nobody else uses, and if you refuse
to make your testcases more prominent, it becomes *your* problem.
There's not much we can do if you refuse to test and refuse to push
your testcases upstream ...

> > ... and you have argued against perf from the very first day on,
> > when you were one of the perfmon developers - and IMO in
> > hindsight you've been repeatedly wrong about most of your design
> > arguments.
>
> I can't find an exact e-mail, but I seem to recall my arguments
> were that Pentium 4 support would be hard (it was), [...]

To the contrary, a single person implemented most of it, out of
curiosity.

> [...] that in-kernel generalized events were a bad idea (I still
> think that, try talking to the ARM guys sometime about that) [...]

To the contrary, generalized events work very well and they are one of
the reasons why the perf tooling is so usable.

> [...] and that making access to raw events hard (by not using a
> naming library) was silly. [...]
To the contrary: by 'making it easy' you mean 'translate hex codes to
vendor-specific gibberish', which is hardly any better for actual
users of the tool and gives the false appearance of being a solution.

All in all you advocated all the oprofile design mistakes, and you
have been proven thoroughly wrong by reality.

> > The PAPI project has the (fundamental) problem that you are still
> > doing it in the old-style sw design fashion, with many months
> > long delays in testing, and then you are blaming the problems you
> > inevitably meet with that model on *us*.
>
> The fundamental problem with the PAPI project is that we only have
> 3 full-time developers, and we have to make sure PAPI runs on about
> 10 different platforms, of which perf_events/Linux is only one.
>
> Time I waste tracking down perf_event ABI regressions and DoS bugs
> takes away from actual useful userspace PAPI development.

If people are not interested in even testing the basic test-suite of
PAPI on a recent kernel, then i'm afraid there must be something very
wrong with the PAPI project structure. Somehow that testing is not
missing from the perf tool, despite it being a much younger and
smaller project. Did you ever stop to think why that is so?

> > There was one PAPI incident i remember where it took you several
> > *months* to report a regression in a regular PAPI test-case (no
> > actual app affected as far as i know). No other tester ever ran
> > the PAPI testcases so nobody else reported it.
>
> We have a huge userbase. They run on some pretty amazing machines
> and do some tests that strain perf libraries to the limit. They
> also tend to use distro kernels, assuming they even have moved to
> 2.6.31+ kernels yet. When these power users report problems, they
> aren't going to be against the -tip tree.
Nobody expects you to test the -tip tree if you don't want to (it
would certainly be useful to you if you are interested in PMU
development), but there's a 2.5-month stabilization window after the
upstream merge.

> > Nobody but you tests PAPI so you need to become *part* of the
> > upstream development process, which releases a new upstream
> > kernel every 3 months.
>
> PAPI is a free software project, with the devel tree available from
> CVS. It takes maybe 15 minutes to run the full PAPI regression
> suite. I encourage you or any perf developer to try it and report
> any issues. I will fix what gets reported

and neither i nor other regular kernel testers actually use it. You
really need to do more testing to fill that gap - expecting others to
volunteer time into a project they don't actually use is extremely
backwards ...

> I can only be so comprehensive. I didn't find the current
> NMI-watchdog regression right away because my git tree builds
> didn't have it enabled. It wasn't until there started being 3.0
> distro kernels that people started reporting the problem to us.
>
> > Also, as i mentioned it several times before, you are free to add
> > an arbitrary number of ABI test-cases to 'perf test' and we can
> > promise that we run that. Right now it consists of a few tests:
>
> as mentioned before I have my own perf_event test suite with 20+ tests:
>
>   http://web.eecs.utk.edu/~vweaver1/projects/perf-events/validation.html

That should probably be moved into perf test. Arnaldo, any objections?

> I do run it often. It tends to be reactionary though, as I can
> only add a test for a bug once I know about it.
>
> I also have more up-to-date perf documentation than the kernel does:
>
>   http://web.eecs.utk.edu/~vweaver1/projects/perf-events/programming.html
>
> and a cpu compatibility matrix:
>
>   http://web.eecs.utk.edu/~vweaver1/projects/perf-events/support.html
>
> I didn't really want to turn this into yet another perf flamewar.
So why then did you launch several malicious, unprovoked,
passive-aggressive ad hominem attacks against perf developers, like:

  "Never overtly. They're too clever for that."

and:

  "Unlike the perf developers, we *do* have to maintain backwards
   compatibility."

? They were untrue, uncalled for, unfair and outright mean-spirited.

Thanks,

	Ingo
* Pekka Enberg <penberg@cs.helsinki.fi> wrote:

> [...] There's an easy fix for this too: improve "perf test" to
> cover the cases you're interested in. While an ABI spec would be a
> nice addition, it's not going to make compatibility problems
> magically go away.

Yes, exactly - 'perf test' has been written with that exact purpose.
In practice 'perf' will cover almost all parts of the ABI.

The one notable thing that isn't being tested in a natural way is the
'group of events' abstraction - which, ironically, has been added on
the perfmon guys' insistence. No app beyond the PAPI self-test makes
actual use of it though, which results in an obvious lack of testing.

Vince: the code is in tools/perf/builtin-test.c and our offer still
stands, feel free to extend it. Maybe there's some other volunteer
willing to do that?

Thanks,

	Ingo
On Tue, 2011-11-08 at 13:15 +0100, Ingo Molnar wrote:
>
> The one notable thing that isn't being tested in a natural way is the
> 'group of events' abstraction - which, ironically, has been added on
> the perfmon guys' insistence. No app beyond the PAPI self-test makes
> actual use of it though, which results in an obvious lack of testing.

Also the self-monitor stuff; perf-tool doesn't use that, for obvious
reasons.
* Theodore Tso <tytso@MIT.EDU> wrote:

> On Nov 8, 2011, at 4:32 AM, Ingo Molnar wrote:
> >
> > No ifs and when about it, these are the plain facts:
> >
> >  - Better features, better ABIs: perf maintainers can enforce clean,
> >    functional and usable tooling support *before* committing to an
> >    ABI on the kernel side.
>
> "We don't have to be careful about breaking interface compatibility
> while we are developing new features".

See my other mail titled:

   [F.A.Q.] perf ABI backwards and forwards compatibility

The compatibility process works surprisingly well, given the
complexity and the flux of changes. From the experience i have with
other ABI and feature extension efforts, perf ABI compatibility works
comparatively better, because the changes always go together, so
people can review and notice ABI problems a lot more easily than with
an artificially fragmented tooling/kernel maintenance setup.

I guess you can do well with a split project as well - my main claim
is that good compatibility comes *naturally* with integration.

Btw., this might explain why iOS and Android are surprisingly
compatible as well, despite the huge complexity and the huge flux of
changes on both platforms - versus modular approaches like Windows or
Linux distros.

> The flip side of this is that it's not obvious when an interface is
> stable, and when it is still subject to change. [...]

... actual results seem to belie that expectation, right?

> [...] It makes life much harder for any userspace code that
> doesn't live in the kernel. [...]

So *that* is the real argument? As long as compatibility is good, i
don't see why that should be the case.

Did you consider the possibility that out-of-tree projects with deep
technical ties to the kernel are at a relative disadvantage to
in-kernel projects because separation is technically costly - with the
costs of separation being larger than its advantages?

> [...]
> And I think we do agree that moving all of userspace into a
> single git tree makes no sense, right?

I'm inclined to agree that applications that have no connection and
affinity to the kernel (technically or socially) should not live in
the kernel repo. (In fact i argue that they should be sandboxed, but
that's another topic.)

But note that there are several OS projects that succeeded in doing
the equivalent of a 'whole world' single Git repo, so i don't think we
have the basis to claim that it *cannot* work.

> > - We have a shared Git tree with unified, visible version control. I
> >   can see kernel feature commits followed by tooling support, in a
> >   single flow of related commits:
> >
> >     perf probe: Update perf-probe document
> >     perf probe: Support --del option
> >     trace-kprobe: Support delete probe syntax
> >
> >   With two separate Git repositories this kind of connection between
> >   the tool and the kernel is inevitably weakened or lost.
>
> "We don't have to clearly document new interfaces between kernel
> and userspace, and instead rely on git commit order for people to
> figure out what's going on with some new interface"

It does not prevent the creation of documentation at all - but i argue
that the actual *working commits* are more valuable information than
the documentation. That inevitably leads to the conclusion that you
cannot destroy the more valuable piece of information just to
artificially promote the creation of the less valuable one, right?

> > - Easier development, easier testing: if you work on a kernel
> >   feature and on matching tooling support then it's *much* easier to
> >   work in a single tree than working in two or more trees in
> >   parallel. I have worked on multi-tree features before, and except
> >   special exceptions they are generally a big pain to develop.
>
> I've developed in the split tree systems, and it's really not that
> hard.
> It does mean you have to be explicit about designing interfaces up
> front, and then you have to have a good, robust way of negotiating
> what features are in the kernel, and what features are supported by
> userspace --- but if you don't do that then having good backwards
> and forwards compatibility between different versions of the tool
> simply doesn't exist.

I actually think that ext4 is a good example of ABI design - and we
borrowed heavily from that positive experience in the perf.data
handling code.

But i also worked in other projects where the split design worked a
lot less smoothly - and arguably ext4 would be *dead* if it had a
messy interface design: a persistent filesystem cannot under any
circumstance be messy and survive in the long run. Other ABIs, not so
much, and we are hurting from that.

> So at the end of the day the question is whether you want to be able
> to (for example) update e2fsck to get better ability to fix more
> file system corruptions, without needing to upgrade the kernel. If
> you want to be able to use a newer, better e2fsck with an older,
> enterprise kernel, then you have to use certain programming
> disciplines. That's where the work is, not in whether you have to
> maintain two git trees or a single git tree.

I demonstrated how this actually works with perf (albeit the
compatibility requirements are a lot less severe on perf than with a
persistent, on-disk filesystem) - do you accept that example as proof?

> > - We are using and enforcing established quality control and
> >   coding principles of the kernel project. If we mess up then
> >   Linus pushes back on us at the last line of defense - and has
> >   pushed back on us in the past. I think many of the currently
> >   external kernel utilities could benefit from the resulting rise
> >   in quality. I've seen separate tool projects degrade into
> >   barely usable tinkerware - that i think cannot happen to perf,
> >   regardless of who maintains it in the future.
>
> That's basically saying that if you don't have someone competent
> managing the git tree and providing quality assurance, life gets
> hard. [...]

No, it says that we want to *guarantee* that someone competent is
maintaining it. If Peter, Arnaldo and i get hit by the same bus or
crash in the same airplane, then i'm pretty confident that life will
go on just fine and capable people will pick it up.

With an external project i wouldn't be nearly as sure about that - it
could be abandonware or could degrade into tinkerware. Working in
groups, structuring that way and relying on the infrastructure of a
large project is an *advantage* of Linux - why should this surprise
*you* of all people, hm? :-)

> [...] Sure. But at the same time, does it scale to move all of
> userspace under one git tree and depend on Linus to push back?

We don't depend on Linus for every single commit, that would be silly
and it would not scale. We depend on Linus depending on someone who
depends on someone else who depends on someone else. 3 people along
that chain would have to make the same bad mistake for crap to get to
Linus, and while it happens, we try to keep it as rare as humanly
possible.

> I mean, it would have been nice to move all of GNOME 3 under the
> Linux kernel, so Linus could have pushed back on behalf of all of
> us power users, [...]

You are starting to make sense ;-)

> [...] but as much as many of us would have appreciated someone
> being able to push back against the insanity which is the GNOME
> design process, is that really a good enough excuse to move all of
> GNOME 3 into the kernel source tree? :-)

Why not?
</joking>

Seriously, if someone gave me a tools/term/ tool that had rudimentary
xterm functionality with tabbing support, written in pure libdri,
starting off a basic fbcon console and taking over the full screen,
i'd switch to it within about 0.5 nanoseconds and would do most of my
daily coding there - and would help out with extending it to more apps
(starting with a sane mail client perhaps).

I'd not expect the Gnome people to move there against their own good
judgement - i have no right to do that. (Nor do i think it would be
possible technically and socially: the culture friction between those
projects is way too large IMO, so it's clearly one of the clear 'HELL
NO!' cases for integration.)

But why do you have to think in absolutes and extremes all the time?
Why not exercise some good case-by-case judgement about the merits of
integration versus separation?

> > - Better debuggability: sometimes a combination of a perf change
> >   and a kernel change causes a breakage. I have bisected the
> >   shared tree a couple of times already, instead of having to
> >   bisect a (100,000 commits x 10,000 commits) combined space,
> >   which is much harder to debug ...
>
> What you are describing happens when someone hasn't been careful
> about their kernel/userspace interfaces.

What i'm describing is what happens when there are complex bugs that
interact in unforeseen ways.

> If you have been rigorous with your interfaces, this isn't really
> an issue. When's the last time we've had to do an NxM exhaustive
> testing run to find a broken syscall ABI between (for example) the
> kernel and MySQL?

MySQL relies very little on complex kernel facilities. perf on the
other hand uses a very complex interface to the kernel and extracts
way more structured information from the kernel than MySQL does.
That's where the whole "is a tool deeply related to the kernel or not"
judgement call starts mattering.
Also, i think we have a very clear example of split projects *NOT*
working very well when it comes to an NxMxO testing matrix: the whole
graphics stack ...

You *really* need to acknowledge those very real complications and
uglies as well when you argue in favor of separation ...

> > - Code reuse: we can and do share source code between the kernel
> >   and the tool where it makes sense. Both the tooling and the
> >   kernel side code improve from this. (Often explicit
> >   librarization makes little sense due to the additional
> >   maintenance overhead of a split library project and the
> >   impossibly long latency of how the kernel can rely on the ready
> >   existence of such a newly created library project.)
>
> How much significant code really can get shared? [...]

It's relatively minor right now, but there are possibilities:

> [...] Memory allocation is different between kernel and userspace
> code, how you do I/O is different, error reporting conventions are
> generally different, etc. You might have some serialization and
> deserialization code which is in common, but (surprise!) that's
> generally part of your interface, which is hopefully relatively
> stable especially once the tool and the interface has matured.

The KVM tool would like to utilize lockdep for example, to cover
user-space locks as well. It already uses the semantics of the kernel
locking primitives:

  disk/qcow.c:	mutex_lock(&q->mutex);
  disk/qcow.c:	mutex_unlock(&q->mutex);
  disk/qcow.c:	mutex_unlock(&q->mutex);
  disk/qcow.c:	mutex_unlock(&q->mutex);
  disk/qcow.c:	mutex_unlock(&q->mutex);
  disk/qcow.c:	mutex_lock(&q->mutex);
  disk/qcow.c:	mutex_unlock(&q->mutex);
  disk/qcow.c:	mutex_unlock(&q->mutex);
  disk/qcow.c:	mutex_unlock(&q->mutex);
  disk/qcow.c:	mutex_unlock(&q->mutex);
  disk/qcow.c:	mutex_lock(&q->mutex);
  ...
and lockdep would certainly make sense for this type of "user-space
that emulates hardware", while i don't think we'd ever want to take on
the overhead of outright librarizing lockdep in an external way.

Thanks,

	Ingo
Em Tue, Nov 08, 2011 at 05:21:50AM -0500, Theodore Tso escreveu:
> On Nov 8, 2011, at 4:32 AM, Ingo Molnar wrote:
> >
> > No ifs and when about it, these are the plain facts:
> >
> >  - Better features, better ABIs: perf maintainers can enforce clean,
> >    functional and usable tooling support *before* committing to an
> >    ABI on the kernel side.
>
> "We don't have to be careful about breaking interface compatibility
> while we are developing new features".

My normal working environment is an MRG PREEMPT_RT kernel (2.6.33.9,
test kernels based on 3.0+) running on enterprise distros while I
develop the userspace part. So no, at least for me, I don't keep
updating the kernel part while developing userspace.

> The flip side of this is that it's not obvious when an interface is
> stable, and when it is still subject to change. It makes life much
> harder for any userspace code that doesn't live in the kernel. And I
> think we do agree that moving all of userspace into a single git tree
> makes no sense, right?

Right, but that is the other extreme, right?

> > - We have a shared Git tree with unified, visible version control. I
> >   can see kernel feature commits followed by tooling support, in a
> >   single flow of related commits:
> >
> >     perf probe: Update perf-probe document
> >     perf probe: Support --del option
> >     trace-kprobe: Support delete probe syntax
> >
> >   With two separate Git repositories this kind of connection between
> >   the tool and the kernel is inevitably weakened or lost.
>
> "We don't have to clearly document new interfaces between kernel and
> userspace, and instead rely on git commit order for people to figure
> out what's going on with some new interface"

Indeed, documentation is lacking. I think, coming from a kernel
standpoint, I relied too much on the "documentation is source code"
mantra of the old days. But I realize it's a necessity, and also that
regression testing is another necessity.
I introduced 'perf test' for this latter need and rejoice every time
people submit new test cases, as Jiri and Han did in the past; it's
just that we need more of both documentation and regression testing.
Unfortunately that is not so sexy, and I have my hands full not just
with perf :-\

> > - Easier development, easier testing: if you work on a kernel
> >   feature and on matching tooling support then it's *much* easier to
> >   work in a single tree than working in two or more trees in
> >   parallel. I have worked on multi-tree features before, and except
> >   special exceptions they are generally a big pain to develop.
>
> I've developed in the split tree systems, and it's really not that
> hard. It does mean you have to be explicit about designing interfaces
> up front, and then you have to have a good, robust way of negotiating
> what features are in the kernel, and what features are supported by
> userspace --- but if you don't do that then having good backwards and
> forwards compatibility between different versions of the tool simply
> doesn't exist.
>
> So at the end of the day the question is whether you want to be able
> to (for example) update e2fsck to get better ability to fix more file
> system corruptions, without needing to upgrade the kernel. If you
> want to be able to use a newer, better e2fsck with an older,
> enterprise kernel, then you have to use certain programming
> disciplines. That's where the work is, not in whether you have to
> maintain two git trees or a single git tree.

But that can as well be achieved with a single tree - or do you think
having a single tree makes it impossible to achieve? As I said, I do
development basically using the split model, at least for testing new
tools on older kernels. People using the tools while developing mostly
the kernel, or both kperf/uperf components, test on the combined
kernel + perf sources.

> > - We are using and enforcing established quality control and coding
> >   principles of the kernel project.
> > If we mess up then Linus pushes
> > back on us at the last line of defense - and has pushed back on us
> > in the past. I think many of the currently external kernel
> > utilities could benefit from the resulting rise in quality.
> > I've seen separate tool projects degrade into barely usable
> > tinkerware - that i think cannot happen to perf, regardless of who
> > maintains it in the future.
>
> That's basically saying that if you don't have someone competent
> managing the git tree and providing quality assurance, life gets hard.
> Sure. But at the same time, does it scale to move all of userspace
> under one git tree and depend on Linus to push back?

8 or 80 again (all or nothing) :-\

> I mean, it would have been nice to move all of GNOME 3 under the Linux
> kernel, so Linus could have pushed back on behalf of all of us power

Sheesh, all of GNOME? How closely related to, and used in, kernel
development is GNOME? GNOME 3?

> users, but as much as many of us would have appreciated someone being
> able to push back against the insanity which is the GNOME design
> process, is that really a good enough excuse to move all of GNOME 3
> into the kernel source tree? :-)

No, but again, you're taking it to the extreme.

- Arnaldo
* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> On Tue, 2011-11-08 at 13:15 +0100, Ingo Molnar wrote:
> >
> > The one notable thing that isn't being tested in a natural way is
> > the 'group of events' abstraction - which, ironically, has been
> > added on the perfmon guys' insistence. No app beyond the PAPI
> > self-test makes actual use of it though, which results in an
> > obvious lack of testing.
>
> Also the self monitor stuff, perf-tool doesn't use that for obvious
> reasons.

Indeed, and that's PAPI's strong point. We could try to utilize it via
some clever LD_PRELOAD trickery?

Adding a testcase for every bug that can be triggered via tooling
would definitely be an improvement as well - those kinds of testcases
generally tend to map out the really important bits faster than an
attempt at exhaustive testing.

Thanks,

	Ingo
Em Tue, Nov 08, 2011 at 01:07:55PM +0100, Ingo Molnar escreveu: > * Vince Weaver <vince@deater.net> wrote: > > as mentioned before I have my own perf_event test suite with 20+ tests. > > http://web.eecs.utk.edu/~vweaver1/projects/perf-events/validation.html > That should probably be moved into perf test. Arnaldo, any > objections? I'd gladly take patches; it's even on my TODO list to volunteer time for that at some point. If somebody other than me or Vince wants to do that... Assuming there is no licensing problem and Vince doesn't object to that being done. I know that at least the QE team at Red Hat uses it and I hope other QE teams do too. - Arnaldo
On Mon, Nov 07, 2011 at 03:12:28PM +0200, Pekka Enberg wrote: > On Mon, Nov 7, 2011 at 2:47 PM, Ted Ts'o <tytso@mit.edu> wrote: > > I don't think perf should be used as a precedent that now argues that > > any new kernel utility should be moved into the kernel sources. Does > > it make sense to move all of mount, fsck, login, etc., into the kernel > > sources? There are far more kernel tools outside of the kernel > > sources than inside the kernel sources. [...] > I don't know if it makes sense to merge the tools you've mentioned above. > My gut feeling is that it's probably not reasonable - there's already a > community working on it with their own development process and coding > style. I don't think there's a simple answer to this but I don't agree with > your rather extreme position that all userspace tools should be kept out > of the kernel tree. Ted's position is not extreme. He follows the simple and exactly defined border between userspace and kernel. The native userspace features are variability and substitutability. The util-linux package is a really nice example:

- you don't have to use it, you can use busybox
- we currently have three implementations of login(1), many getty implementations, etc.
- it's normal that people use the latest util-linux releases with very old kernels (in 2008 I had a report from a person with kernel 2.4 :-)
- userspace is very often about portability -- it's crazy, but some people use some utils from util-linux on Hurd, Solaris and BSD (including very Linux-specific things like mkswap and hwclock)

Anyway, I agree that small one-man projects are ineffective for important system tools -- it's usually better to merge things into large projects with reliable infrastructure and an active community (here I agree with Lennart's idea to have 3-5 projects for the whole low-level userspace). Karel
Hi, > Indeed, documentation is lacking, I think coming from a kernel > standpoint I relied too much on the "documentation is source code" > mantra of old days. Sorry for the shameless plug, but as you are speaking of lacking documentation: Where the heck is the perf config file documented, other than source code? Reading the parser to figure out how the config file is supposed to look really isn't fun :( I'm looking for a way to disable the colors in the perf report tui. Or configure them into something readable. No, light green on light gray which is used by default isn't readable. thanks, Gerd
On Tue, Nov 8, 2011 at 3:29 PM, Karel Zak <kzak@redhat.com> wrote: >> I don't know if it makes sense to merge the tools you've mentioned above. >> My gut feeling is that it's probably not reasonable - there's already a >> community working on it with their own development process and coding >> style. I don't think there's a simple answer to this but I don't agree with >> your rather extreme position that all userspace tools should be kept out >> of the kernel tree. > > Ted's position is not extreme. He follows the simple and exactly defined > border between userspace and kernel. The native userspace features are > variability and substitutability. It's an extreme position because he's arguing that we should only have kernel code in the tree or we need to open up to all userspace code. Pekka
Em Tue, Nov 08, 2011 at 02:40:42PM +0100, Gerd Hoffmann escreveu: > > Indeed, documentation is lacking, I think coming from a kernel > > standpoint I relied too much on the "documentation is source code" > > mantra of old days. > Sorry for the shameless plug, but as you are speaking of lacking Thank you! It's easier when I get questions about specific problems in the documentation :-) > documentation: Where the heck is the perf config file documented, other > than source code? Reading the parser to figure out how the config file is > supposed to look really isn't fun :( > I'm looking for a way to disable the colors in the perf report tui. Or > configure them into something readable. No, light green on light gray > which is used by default isn't readable. That was fixed in 3.2-rc1, where we also have:

[acme@felicio linux]$ cat tools/perf/Documentation/perfconfig.example
[colors]
        # These were the old defaults
        top = red, lightgray
        medium = green, lightgray
        normal = black, lightgray
        selected = lightgray, magenta
        code = blue, lightgray
[tui]
        # Defaults if linked with libslang
        report = on
        annotate = on
        top = on
[buildid]
        # Default, disable using /dev/null
        dir = /root/.debug
[acme@felicio linux]$

So you can use:

[tui]
        report = off

to disable the TUI altogether, or use:

$ perf report --stdio

Or tweak the colors to your liking. By default the TUI now uses whatever color is configured for your xterm, not something fixed as in the past, which was a common source of complaints that, unfortunately, I only heard indirectly :-\ Ah, if you still need to configure the colors, use "default" so that it will use whatever color is configured in your xterm/gnome-terminal/whatever profile.
For reference, the default set of colors now is (from tools/perf/util/ui/browser.c):

static struct ui_browser__colorset {
        const char *name, *fg, *bg;
        int colorset;
} ui_browser__colorsets[] = {
        {
                .colorset = HE_COLORSET_TOP,
                .name     = "top",
                .fg       = "red",
                .bg       = "default",
        },
        {
                .colorset = HE_COLORSET_MEDIUM,
                .name     = "medium",
                .fg       = "green",
                .bg       = "default",
        },
        {
                .colorset = HE_COLORSET_NORMAL,
                .name     = "normal",
                .fg       = "default",
                .bg       = "default",
        },
        {
                .colorset = HE_COLORSET_SELECTED,
                .name     = "selected",
                .fg       = "black",
                .bg       = "lightgray",
        },
        {
                .colorset = HE_COLORSET_CODE,
                .name     = "code",
                .fg       = "blue",
                .bg       = "default",
        },
};

It should all be fixed up now, together with many other improvements that should make the TUI and stdio default user experience similar up until you start using the navigation keys to do things that are only possible with a TUI, like folding/unfolding callchains, etc. Please let me know about any other problem you may find with it! - Arnaldo
On 11/06/2011 03:35 AM, Alexander Graf wrote: > To quickly get going, just execute the following as user: > > $ ./Documentation/run-qemu.sh -r / -a init=/bin/bash > > This will drop you into a shell on your rootfs. > Doesn't work on Fedora 15. F15's qemu-kvm doesn't have -machine or -virtfs. Even qemu.git on F15 won't build virtfs since xattr.h detection is broken (patch posted).
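One way a wrapper script could cope with older distro binaries like F15's is to probe the chosen QEMU binary for the options it needs before using them, instead of failing at guest-start time. A minimal sketch (the `qemu_supports` helper is my own illustration, not part of the posted script):

```shell
# Hypothetical helper: check whether a given QEMU binary advertises an
# option in its -help output before the wrapper tries to use it.
qemu_supports() {
    # $1 = path to a qemu binary, $2 = option name without the leading dash
    "$1" -help 2>/dev/null | grep -q -e "-$2"
}

# Example: older binaries such as F15's qemu-kvm lack -virtfs entirely
if qemu_supports "${QEMU_BIN:-qemu-kvm}" virtfs; then
    echo "using -virtfs rootfs passthrough"
else
    echo "no -virtfs support; a newer QEMU is needed" >&2
fi
```

Parsing -help output is crude but works on QEMU versions that predate more structured capability queries.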
On Tue, Nov 08, 2011 at 04:41:40PM +0200, Avi Kivity wrote: > On 11/06/2011 03:35 AM, Alexander Graf wrote: > > To quickly get going, just execute the following as user: > > > > $ ./Documentation/run-qemu.sh -r / -a init=/bin/bash > > > > This will drop you into a shell on your rootfs. > > > > Doesn't work on Fedora 15. F15's qemu-kvm doesn't have -machine or > -virtfs. Even qemu.git on F15 won't build virtfs since xattr.h > detection is broken (patch posted). Nevermind that running virtfs as a rootfs is a really dumb idea. You do not want to run a VM that has a rootfs that gets changed all the time behind your back. Running qemu -snapshot on the actual root block device is the only safe way to reuse the host installation, although it gets a bit complicated if people have multiple devices mounted into the namespace.
On Tue, Nov 8, 2011 at 4:52 PM, Christoph Hellwig <hch@infradead.org> wrote: > On Tue, Nov 08, 2011 at 04:41:40PM +0200, Avi Kivity wrote: >> On 11/06/2011 03:35 AM, Alexander Graf wrote: >> > To quickly get going, just execute the following as user: >> > >> > $ ./Documentation/run-qemu.sh -r / -a init=/bin/bash >> > >> > This will drop you into a shell on your rootfs. >> > >> >> Doesn't work on Fedora 15. F15's qemu-kvm doesn't have -machine or >> -virtfs. Even qemu.git on F15 won't build virtfs since xattr.h >> detection is broken (patch posted). > > Nevermind that running virtfs as a rootfs is a really dumb idea. You > do not want to run a VM that has a rootfs that gets changed all the > time behind your back. > > Running qemu -snapshot on the actual root block device is the only > safe way to reuse the host installation, although it gets a bit > complicated if people have multiple devices mounted into the namespace. Using block devices also requires root.
On 11/08/2011 04:52 PM, Christoph Hellwig wrote: > On Tue, Nov 08, 2011 at 04:41:40PM +0200, Avi Kivity wrote: > > On 11/06/2011 03:35 AM, Alexander Graf wrote: > > > To quickly get going, just execute the following as user: > > > > > > $ ./Documentation/run-qemu.sh -r / -a init=/bin/bash > > > > > > This will drop you into a shell on your rootfs. > > > > > > > Doesn't work on Fedora 15. F15's qemu-kvm doesn't have -machine or > > -virtfs. Even qemu.git on F15 won't build virtfs since xattr.h > > detection is broken (patch posted). > > Nevermind that running virtfs as a rootfs is a really dumb idea. You > do not want to run a VM that has a rootfs that gets changed all the > time behind your back. True. > Running qemu -snapshot on the actual root block device is the only > safe way to reuse the host installation, although it gets a bit > complicated if people have multiple devices mounted into the namespace. How is -snapshot any different? If the host writes a block after the guest has been launched, but before that block was cowed, then the guest will see the new block. It could work with a btrfs snapshot, but not everyone uses that.
On Tue, Nov 08, 2011 at 04:57:04PM +0200, Avi Kivity wrote: > > Running qemu -snapshot on the actual root block device is the only > > safe way to reuse the host installation, although it gets a bit > > complicated if people have multiple devices mounted into the namespace. > > How is -snapshot any different? If the host writes a block after the > guest has been launched, but before that block was cowed, then the guest > will see the new block. Right, thinko - qemu's snapshots are fairly useless due to sitting on top of the file to be modified. > It could work with a btrfs snapshot, but not everyone uses that. Or LVM snapshot. Either way, just reusing the root fs without care is a dumb idea, and I really don't want any tool or script that encourages such braindead behaviour in the kernel tree.
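For illustration, the LVM variant Christoph mentions could look roughly like the sketch below. All names and sizes (vg0, root, 2G) are assumptions, and it is wrapped in a function so that sourcing the snippet executes nothing; the real commands need root and an LVM-backed root filesystem.

```shell
# Hedged sketch: boot a freshly built kernel against a throwaway LVM
# snapshot of the root LV, so the guest never writes to the live root fs
# and host-side writes cannot change blocks underneath the running guest.
boot_on_lvm_snapshot() {
    local vg=vg0 lv=root snap=kernel-test   # assumed VG/LV names

    lvcreate --snapshot --size 2G --name "$snap" "/dev/$vg/$lv"
    qemu-kvm -kernel arch/x86/boot/bzImage \
             -drive "file=/dev/$vg/$snap,if=virtio" \
             -append 'root=/dev/vda ro'
    lvremove -f "/dev/$vg/$snap"            # discard all guest writes
}
```

Unlike qemu -snapshot, the COW layer here sits below the host filesystem, so host activity after the snapshot is taken does not leak into the guest's view.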
On 2011-11-08 15:52, Christoph Hellwig wrote: > On Tue, Nov 08, 2011 at 04:41:40PM +0200, Avi Kivity wrote: >> On 11/06/2011 03:35 AM, Alexander Graf wrote: >>> To quickly get going, just execute the following as user: >>> >>> $ ./Documentation/run-qemu.sh -r / -a init=/bin/bash >>> >>> This will drop you into a shell on your rootfs. >>> >> >> Doesn't work on Fedora 15. F15's qemu-kvm doesn't have -machine or >> -virtfs. Even qemu.git on F15 won't build virtfs since xattr.h >> detection is broken (patch posted). > > Nevermind that running virtfs as a rootfs is a really dumb idea. You > do not want to run a VM that has a rootfs that gets changed all the > time behind your back. > > Running qemu -snapshot on the actual root block device is the only > safe way to reuse the host installation, although it gets a bit > complicated if people have multiple devices mounted into the namespace. I thought about this while hacking a slide on this topic: It's clumsy (compared to -snapshot - my favorite one as well), but you could use some snapshot on the host fs. Or a union fs (if we had an official one) with the write layer directed to some tmpfs area. But what we would rather want (as it would work without privileges) is built-in write redirection for virtfs. Not an expert on this, but I guess that will have to solve the same problems an in-kernel union fs solution faces, no? Jan
On Tue, Nov 8, 2011 at 4:52 PM, Christoph Hellwig <hch@infradead.org> wrote: > Nevermind that running virtfs as a rootfs is a really dumb idea. You > do not want to run a VM that has a rootfs that gets changed all the > time behind your back. It's rootfs binaries that are shared, not configuration. It's unfortunate but works OK for the single user use case it's meant for. It's obviously not a proper solution for the generic case. We were hoping that we could use something like overlayfs to sweep the issue under the rug. Do you think that's also a really dumb thing to do? Using block device snapshotting would be interesting and we should definitely look into that. Pekka
On Tue, Nov 08, 2011 at 05:26:03PM +0200, Pekka Enberg wrote: > On Tue, Nov 8, 2011 at 4:52 PM, Christoph Hellwig <hch@infradead.org> wrote: > > Nevermind that running virtfs as a rootfs is a really dumb idea. You > > do not want to run a VM that has a rootfs that gets changed all the > > time behind your back. > > It's rootfs binaries that are shared, not configuration. It's > unfortunate but works OK for the single user use case it's meant for. > It's obviously not a proper solution for the generic case. We were > hoping that we could use something like overlayfs to sweep the issue > under the rug. Do you think that's also a really dumb thing to do? It doesn't hide your issues. Any kind of unioning will have massive consistency issues (as in will corrupt your fs if you do stupid things) if the underlying layer is allowed to be written to. Thus all the fuss about making sure the underlying fs can never be mounted writeable in the union mount patches.
Hi, >> documentation: Where the heck is the perf config file documented, other >> than source code? Reading the parser to figure out how the config file is >> supposed to look really isn't fun :( > >> I'm looking for a way to disable the colors in the perf report tui. Or >> configure them into something readable. No, light green on light gray >> which is used by default isn't readable. > > That was fixed in 3.2-rc1, where we also have: Very cutting edge. /me pulls. > [acme@felicio linux]$ cat tools/perf/Documentation/perfconfig.example Present now, thanks. > [colors] > > # These were the old defaults > top = red, lightgray > medium = green, lightgray > normal = black, lightgray > selected = lightgray, magenta > code = blue, lightgray Seems to have no effect, guess the distro perf binary is too old for that (RHEL-6). > [tui] > > report = off That works. I don't want to turn off the tui altogether though, I actually like the interactive expanding+collapsing of the call graphs. I just want to turn off the colors. perf_color_default_config() in util/color.c seems to look up a "color.ui" config variable. Can I set that somehow? Tried ui= in a [color] section -- no effect.
My xterms have different background colors, the ones with a root shell happen to have a (dark) red background, which results in red-on-dark-red text. Not good. I'd strongly suggest to either set both background and foreground to default or to set both to a specific color. When doing the latter, make sure the colors have enough contrast so they are readable. cheers, Gerd
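Gerd's suggestion can be written down as a perfconfig fragment along these lines (the color values are illustrative, and the section/key syntax follows the perfconfig.example file shipped with 3.2-rc1):

```shell
# Illustrative ~/.perfconfig fragment: set foreground AND background
# together, either both "default" or both explicit, so no terminal
# profile can end up with pairs like red-on-dark-red.
# Written to a temp file here; in real use this would be ~/.perfconfig.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
[colors]
        top = default, default
        medium = default, default
        normal = default, default
        selected = black, lightgray
        code = default, default
EOF
cat "$cfg"
```

The "selected" pair stays explicit on both sides so the highlight bar remains visible on any background.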
On Tue, Nov 08, 2011 at 10:32:25AM +0100, Ingo Molnar wrote: > > None of the perf developers with whom i'm working complained about > the shared repo so far - publicly or privately. By all means they are > enjoying it and if you look at the stats and results you'll agree > that they are highly productive working in that environment. Just because you brought it up. I personally find it awkward to work in the linux tools directory. Maybe this is the reason that I haven't been such a big contributor to perf. I only pushed ktest into the kernel tools directory because people convinced me to do so. Having it there didn't seem to bring in many other developers. Only one other person has contributed, and that was just some minor changes. I still find it awkward to work on ktest inside the kernel. I have a separate tree just for ktest, and that means I have all the kernel files sitting there doing nothing just to be able to work on 2 files. Then there's the issue of waiting for Linus to pull from me. I posted my patch set on Oct 28th, and it didn't make it into the merge window. I don't know if Linus had an issue with it, or it just got lost in the noise, as Linus has a lot of other things to worry about. This brings up another question. Does Linus scale? Having more tools in the kernel repo requires Linus to pull from more sources. Or are we just going to have to have a "tools" maintainer? This will give a lot of control to that person who is the gatekeeper of the tools directory. Now I've kept trace-cmd and kernelshark outside the kernel tree. I've received lots of patches from other developers for them and some nice new features. It requires me to think hard to keep a nice ABI, and it has been working nicely. The event parsing is working well and there's even a library. But I haven't pushed it too hard because I want this to apply to perf as well. But due to disagreements over where in the kernel tree it belongs, it has been over a year with no progress.
Now we waste 4 bytes for every event recording a non-existent big kernel lock counter. For recording a million events (which is actually low) that's 4 MB of wasted kernel memory. New tracepoints are going into the kernel all the time, and without a library, we are increasing the chance that more tools will break on changes, and tracepoints will lock down kernel innovation soon if something is not done. Anyway, I'm having surgery tomorrow and have other things to work on. -- Steve
Em Tue, Nov 08, 2011 at 04:38:48PM +0100, Gerd Hoffmann escreveu: > Seems to have no effect, guess the distro perf is too old (RHEL-6). > > [tui] > > report = off > That works. I don't want to turn off the tui altogether though, I actually > like the interactive expanding+collapsing of the call graphs. I just > want to turn off the colors. > perf_color_default_config() in util/color.c seems to look up a "color.ui" > config variable. Can I set that somehow? Tried ui= in a [color] > section -- no effect. Ouch, that came from the code initially stolen^Wcopied from git :-\ I don't think that will have any effect :-\ > > Ah, if you still need to configure the colors, use "default" so that it > > will use whatever is the color configured in your > > xterm/gnome-terminal/whatever profile. > > For reference, the default set of colors now is: > > .colorset = HE_COLORSET_TOP, > > .name = "top", > > .fg = "red", > > .bg = "default", > Bad idea IMO. Setting only one of foreground+background gives pretty > much unpredictable results. My xterms have different background colors, > the ones with a root shell happen to have a (dark) red background, > which results in red-on-dark-red text. Not good. > I'd strongly suggest to either set both background and foreground to > default or to set both to a specific color. When doing the latter make That is the case for the normal one, two colorsets below the HE_COLORSET_TOP one. Humm, certainly there could be logic to figure out whether background == foreground and do something about it. > sure the colors have enough contrast so they are readable. Problem is figuring out something that is considered a good default :-\ There will always be somebody that will complain. When doing the coding to allow using the default xterm colors I tried several of the gnome-terminal xterm profiles and all looked kinda sane for the "top" (hottest functions, with most hits) and "medium" lines, where we combine some chosen foreground color ("red" and "green").
Laziest solution would be: If the user customizes that much, could the user please customize this as well? :-) - Arnaldo
On Tue, Nov 08, 2011 at 01:55:09PM +0100, Ingo Molnar wrote: > I guess you can do well with a split project as well - my main claim > is that good compatibility comes *naturally* with integration. Here I have to disagree; my main worry is that integration makes it *naturally* easy for people to skip the hard work needed to keep a stable kernel/userspace interface. The other worry which I've mentioned, but which I haven't seen addressed, is that even if you can use a perf from a newer kernel with an older kernel, this causes distributions a huge amount of pain, since they have to package two different kernel source packages, and only compile perf from the newer kernel source package. This leads to all sorts of confusion from a distribution packaging point of view. For example, assume that RHEL 5, which is using 2.6.32 or something like that, wants to use a newer e2fsck that does a better job fixing file system corruptions. If it were bundled with the kernel, then they would have to package up the v3.1 kernel sources, and have a source RPM that isn't used for building kernel sources, but just to build a newer version of e2fsck. Fortunately, they don't have to do that. They just pull down a newer version of e2fsprogs, and package, build, test, and ship that. In addition, suppose Red Hat ships a security bug fix which means a new kernel-image RPM has to be shipped. Does that mean that Red Hat has to ship new binary RPM's for any and all tools/* programs that they have packaged as separate RPM's? Or should installing a new kernel RPM also imply dropping new binaries in /usr/bin/perf, et al.? There are all sorts of packaging questions that are raised by integration, and from where I sit I don't think they've been adequately solved yet.
> Did you consider it a possibility that out of tree projects that have > deep ties to the kernel technically seem to be at a relative > disadvantage to in-kernel projects because separation is technically > costly with the costs of separation being larger than the advantages > of separation? As the e2fsprogs developer, I live with the costs all the time; I can testify to the fact that they are very slight. Occasionally I have to make parallel changes to fs/ext4/ext4.h in the kernel and lib/ext2fs/ext2fs.h in e2fsprogs, and we use various different techniques to detect whether the ext4 kernel code supports a particular feature (we use the presence or absence of some sysfs files), but it's really not been hard for us. > But note that there are several OS projects that succeeded doing the > equivalent of a 'whole world' single Git repo, so i don't think we > have the basis to claim that it *cannot* work. There have been, indeed, and there has been speculation that this was one of many contributing factors in why they lost out in the popularity and adoption competition with Linux. (Specifically, the reasoning goes that the need to package up the kernel plus userspace meant that we had distributions in the Linux ecosystem, and the competition kept everyone honest. If one distribution started making insane decisions, whether it's forcing Unity on everyone, or forcing GNOME 3 on everyone, it's always possible to switch to another distribution. The *BSD systems didn't have that safety valve....) > But why do you have to think in absolutes and extremes all the time? > Why not excercise some good case by case judgement about the merits > of integration versus separation? I agree that there are tradeoffs to both approaches, and I agree that case by case judgement is something that should be done.
One of the reasons why I've spent a lot of time pointing out the downsides of integration and the shortcomings in the integration position is that I've seen advocates claiming that the fact that perf was integrated was a precedent that meant the choice for kvm-tool was something that should not be questioned, since tools/perf justified anything they wanted to do, and that if we wanted to argue about whether kvm-tool should have been bundled into the kernel, we should have made different decisions about perf. Regards, - Ted
@Ted Ts'o: are you sponsored by something like Microsoft (joking)? Stop trolling. If you are not familiar with perf, or other tools, save your time and do some useful things.
On 11/08/2011 03:59 PM, Christoph Hellwig wrote: > On Tue, Nov 08, 2011 at 04:57:04PM +0200, Avi Kivity wrote: >>> Running qemu -snapshot on the actual root block device is the only >>> safe way to reuse the host installation, although it gets a bit >>> complicated if people have multiple devices mounted into the namespace. >> How is -snapshot any different? If the host writes a block after the >> guest has been launched, but before that block was cowed, then the guest >> will see the new block. > Right, thinko - qemu's snapshots are fairly useless due to sitting > ontop of the file to be modified. > >> It could work with a btrfs snapshot, but not everyone uses that. > Or LVM snapshot. Either way, just reusing the root fs without care > is a dumb idea, and I really don't want any tool or script that > encourages such braindead behaviour in the kernel tree. Heh, yeah, the intent was obviously to have a separate rootfs tree somewhere in a directory. But that's not available at first when running this, so I figured for a simple "get me rolling" FAQ directing the guest's rootfs to / at least gets you somewhere (especially when run as user with init=/bin/bash). Alex
On 11/08/2011 07:34 PM, Alexander Graf wrote: >> >>> It could work with a btrfs snapshot, but not everyone uses that. >> Or LVM snapshot. Either way, just reusing the root fs without care >> is a dumb idea, and I really don't want any tool or script that >> encurages such braindead behaviour in the kernel tree. > > > Heh, yeah, the intent was obviously to have a separate rootfs tree > somewhere in a directory. But that's not available at first when > running this, so I figured for a simple "get me rolling" FAQ directing > the guest's rootfs to / at least gets you somewhere (especially when > run as user with init=/bin/bash). > Right, init=/bin/bash is not too insane for rootfs passthrough. /proc will be completely broken though, need to mount the guest's.
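The broken /proc Avi mentions can be fixed by hand once inside the guest shell. A hedged sketch, wrapped in a function because these mounts only make sense inside the guest, as root:

```shell
# Sketch: first steps inside a guest started with init=/bin/bash.
# No init ran, so the pseudo-filesystems are not mounted yet; mount the
# guest's own instances instead of relying on anything from the host.
guest_early_setup() {
    mount -t proc proc /proc
    mount -t sysfs sysfs /sys
    # optional; depends on the kernel being built with CONFIG_DEVTMPFS
    mount -t devtmpfs dev /dev 2>/dev/null || true
}
```

Until /proc is remounted, tools like ps, mount and free will show host-derived or empty data inside the guest.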
On Tue, Nov 08, 2011 at 07:14:57PM +0200, Anca Emanuel wrote: > @Ten Ts'o: you are sponsored by something like microsoft (joking) ? > Stop trolling. If you are not familiar with perf, or other tools, save > your time and do some useful things. I am quite familiar with perf. A disagreement with how things are done is not trolling. - Ted
On Tue, 8 Nov 2011, Ted Ts'o wrote: > On Tue, Nov 08, 2011 at 01:55:09PM +0100, Ingo Molnar wrote: > > I guess you can do well with a split project as well - my main claim > > is that good compatibility comes *naturally* with integration. > > Here I have to disagree; my main worry is that integration makes it > *naturally* easy for people to skip the hard work needed to keep a > stable kernel/userspace interface. > > The other worry which I've mentioned, but which I haven't seen > addressed, is that even if you can use a perf from a newer kernel > with an older kernel, this causes distributions a huge amount of pain, > since they have to package two different kernel source packages, and > only compile perf from the newer kernel source package. This leads to > all sorts of confusion from a distribution packaging point of view. > > For example, assume that RHEL 5, which is using 2.6.32 or something > like that, wants to use a newer e2fsck that does a better job fixing > file system corruptions. If it were bundled with the kernel, then > they would have to package up the v3.1 kernel sources, and have a > source RPM that isn't used for building kernel sources, but just to > build a newer version of e2fsck. Fortunately, they don't have to do > that. They just pull down a newer version of e2fsprogs, and package, > build, test, and ship that. > > In addition, suppose Red Hat ships a security bug fix which means a > new kernel-image RPM has to be shipped. Does that mean that Red Hat > has to ship new binary RPM's for any and all tools/* programs that > they have packaged as separate RPM's? Or should installing a new > kernel RPM also imply dropping new binaries in /usr/bin/perf, et al.? > There are all sorts of packaging questions that are raised by > integration, and from where I sit I don't think they've been > adequately solved yet. > This in practice is not a big deal.
There are many approaches for how the RPM can be built, but basically getting the perf source is just a matter of make perf-tar-src-pkg or friends such as make perf-tarbz2-src-pkg, which will create perf-3.2.0-rc1.tar and perf-3.2.0-rc1.tar.bz2 respectively, which can be used for the src rpms. This tarball can be used as a separate package or subpackage. Thanks
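The flow described above can be sketched as a small helper (the rpmbuild staging path is an assumption; it is a function so nothing runs unless you call it with a real kernel tree):

```shell
# Hypothetical sketch: build the standalone perf source tarball from a
# kernel tree and stage it where rpmbuild conventionally looks for sources.
stage_perf_source() {
    local tree=$1                         # path to a kernel source tree
    make -C "$tree" perf-tarbz2-src-pkg   # produces perf-<version>.tar.bz2
    mkdir -p ~/rpmbuild/SOURCES
    cp "$tree"/perf-*.tar.bz2 ~/rpmbuild/SOURCES/
}
```

The resulting tarball builds perf on its own, without dragging the full kernel source into the src rpm.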
On Tue, 8 Nov 2011, Arnaldo Carvalho de Melo wrote: > Em Tue, Nov 08, 2011 at 01:07:55PM +0100, Ingo Molnar escreveu: > > * Vince Weaver <vince@deater.net> wrote: > > > as mentioned before I have my own perf_event test suite with 20+ tests. > > > http://web.eecs.utk.edu/~vweaver1/projects/perf-events/validation.html > > > That should probably be moved into perf test. Arnaldo, any > > objections? > > I'd gladly take patches; it's even on my TODO list to volunteer > time for that at some point. > > If somebody other than me or Vince wants to do that... Assuming there is > no licensing problem and Vince doesn't object to that being done. I have no objections, though I don't really have time right now to do the work myself. The test code is licensed dual GPLv2/BSD. I should stick that in the package somewhere if I haven't already. My test cases mostly test things necessary for proper PAPI functionality and are by no means complete. There are huge areas of perf_event functionality that are not well tested, especially the overflow code. Vince
* Ted Ts'o <tytso@mit.edu> wrote: > On Tue, Nov 08, 2011 at 01:55:09PM +0100, Ingo Molnar wrote: > > > I guess you can do well with a split project as well - my main > > claim is that good compatibility comes *naturally* with > > integration. > > Here I have to disagree; my main worry is that integration makes it > *naturally* easy for people to skip the hard work needed to keep a > stable kernel/userspace interface. There are two observations i have: Firstly, how come this has not actually happened in practice in the case of perf? Looks like the (random) version compatibility experiment i conducted yesterday should have failed spectacularly. Secondly, within the kernel we don't have a stable ABI - we don't even have stable APIs, and still it's a 15 MLOC project that is thriving. I argue that it is thriving in large part *BECAUSE* we don't have a stable API of any sort: if stuff is broken and the whole world needs to be fixed then we fix the whole world. One could even make the argument that in the special case of deeply kernel-integrated tools a stable kernel/userspace interface for those special, Linux-specific ABIs is *too expensive* and results in an inferior end result. I'd really love it if people started thinking outside the box a bit. Why do people assume that *all* of the kernel project's code *has* to run in kernel mode? It's not a valid technical restriction *at all*. "It has been done like this for 30 years" is not a valid technical restriction. Splitting deeply kernel-related tools away from the kernel was a valid decision 15 years ago due to kernel image size and similar resource considerations. Today it's less and less true and we are *actively hurting* from tools being split away from the kernel proper. Graphics, storage and user-space suspend are, i think, good examples of separation gone bad: and the resulting mess has cost Linux distros *the desktop market*. Think about it, the price we pay for this inferior end result is huge.
ext4tools is an example of separation gone good. I think it's the exception that proves the rule. Why was the 2.4 to 2.6 migration so difficult? I can tell you the distro side story: mainly because the release took too long and tools broke left and right, which created stop-ship situations. We had a much larger ABI cross section than we could sanely handle with the testing power we had. So we got into a negative feedback loop: the reduction in 2.3 testers further delayed the release, which moved the (independently evolving ...) tools further away from the to-be-2.6 kernel, which further reduced the effective testing. It was not a sustainable situation. We addressed many of the problems by shortening the release cycle to 3 months, but IMHO we have not addressed the underlying problem of lack of integration. Responsible release engineering is actually *easier* if you don't have a moving target and if you have the ability to fix stuff that breaks without being bound to an external project. Deeply kernel integrated tools could come in the initrd and could be offered by the kernel, statically linked images made available via /proc/sbin or such. We could even swap them out on demand so there's no RAM overhead. There's no technical barrier. I'd even argue that the C library is obviously something the kernel should offer as well - so klibc is the way to go and would help us further streamline this and keep Linux quality high. We could actually keep the kernel and such tools tightly integrated, reducing the compatibility matrix. The kernel would upgrade with these tools but it *already* upgrades with some user-space components like the vdso so it's not a true technical barrier.
> The other worry which I've mentioned, but which I haven't seen > addressed, is that even if you can use a perf from a newer > kernel with an older kernel, this causes distributions a huge > amount of pain, since they have to package two different kernel > source packages, and only compile perf from the newer kernel source > package. This leads to all sorts of confusion from a distribution > packaging point of view. > > For example, assume that RHEL 5, which is using 2.6.32 or something > like that, wants to use a newer e2fsck that does a better job > fixing file system corruptions. [...] Firstly, it's not a big issue: if a tool comes with the kernel package then it's part of the regular backporting flow: if you backport a new tool to an old kernel then you do the same as if you backported a new kernel feature to an older enterprise kernel. Happens all the time; it's a technological problem with technological solutions. Enterprise distros explicitly do not support cross-distro-version package installs, so backporting will be done anyway. Secondly, i actually think that the obsession with using obsolete kernel versions is silly technologically - and it has evolved that way partly *BECAUSE* we are not integrated enough and distros fear kernel upgrades, because those had the bad habit of *breaking tools*. The answer to that problem is to reduce the external cross section of the kernel and make sure that tools upgrade nicely together with the kernel - and integrating tools is a valid way to achieve that. > > Did you consider it a possibility that out of tree projects that > > have deep ties to the kernel technically seem to be at a relative > > disadvantage to in-kernel projects because separation is > > technically costly with the costs of separation being larger than > > the advantages of separation? > > As the e2fsprogs developer, I live with the costs all the time; I > can testify to the fact that they are very slight. [...]
Seriously, how can you tell? You've never tried the integrated approach. I testified to the fact from the first-hand experience of having tried both models of development. > > But note that there are several OS projects that succeeded doing > > the equivalent of a 'whole world' single Git repo, so i don't > > think we have the basis to claim that it *cannot* work. > > There have indeed, and there has been speculation that this was one of > many contributions to why they lost out in the popularity and > adoption competition with Linux. [...] I don't see Android having "lost out" in any way, do you? I actually see Android as being an obviously more successful approach to Linux on the desktop than anything else seen so far. We should at minimum stop and think about that fact, observe it, learn and adapt. iOS also has not 'lost out' to Linux in any way. > > But why do you have to think in absolutes and extremes all the > > time? Why not exercise some good case by case judgement about > > the merits of integration versus separation? > > I agree that there are tradeoffs to both approaches, and I agree > that case by case judgement is something that should be done. One > of the reasons why I've spent a lot of time pointing out the > downsides of integration and the shortcomings in the integration > position is that I've seen advocates claiming that the fact that > perf was integrated was a precedent that meant that the choice for > kvm-tool was something that should not be questioned, since > tools/perf justified anything they wanted to do, and that if we > wanted to argue about whether kvm-tool should have been bundled > into the kernel, we should have made different decisions about perf. I don't think Pekka claimed 'anything goes' at all when he asked for tools/kvm to be merged upstream - why are you using that strawman argument? He listed numerous valid technological reasons why they decided to work in the tools/kvm/ space and the results speak for themselves. > [...]
(Specifically, the reasoning goes that the need to package up > the kernel plus userspace meant that we had distributions in the > Linux ecosystem, and the competition kept everyone honest. If one > distribution started making insane decisions, whether it's forcing > Unity on everyone, or forcing GNOME 3 on everyone, it's always > possible to switch to another distribution. The *BSD systems > didn't have that safety valve....) I don't think your argument makes much sense: how come Linux, a 15 MLOC monster project running for 20 years has not been destroyed by the "lack of the safety valve" problem? Why would adding the at most 1 MLOC deeply kernel related Linux tool and library space to the kernel repo affect the dynamics negatively? We added more code to the kernel last year alone. Fact is, competition thrives within the Linux kernel as well. Why is a coherent, unified, focused project management an impediment to a good technological result? Especially when it comes to desktop computers / tablets / smartphones, where having a unified project is a *must*, so extreme are the requirements of users to get a coherent experience. Think about this plain fact: there's not a single successful smartphone OS on the market that does not have unified project management. Yes, correlation is not causation and such, but still, think about it for a moment. Thanks, Ingo
* Ted Ts'o <tytso@mit.edu> wrote: > On Tue, Nov 08, 2011 at 07:14:57PM +0200, Anca Emanuel wrote: > > @Ten Ts'o: you are sponsored by something like microsoft (joking) > > ? Stop trolling. If you are not familiar with perf, or other > > tools, save your time and do some useful things. > > I am quite familiar with perf. A disagreement with how things are > done is not trolling. Anca, Ted is not trolling me in any fashion. He is a (very successful) tool space and kernel developer and his opinion and experience about how tools should interact with the kernel project is of utmost importance. Clearly Ted thinks that filesystem tools should stay separate from the kernel repo. I agree with him that the case for filesystem tool integration is weaker than for deeply kernel integrated tools such as perf or kvm and calling him a troll is not a way to settle that honest disagreement in any case. Thanks, Ingo
* John Kacur <jkacur@redhat.com> wrote: > On Tue, 8 Nov 2011, Ted Ts'o wrote: > > > On Tue, Nov 08, 2011 at 01:55:09PM +0100, Ingo Molnar wrote: > > > I guess you can do well with a split project as well - my main > > > claim is that good compatibility comes *naturally* with > > > integration. > > > > Here I have to disagree; my main worry is that integration makes > > it *naturally* easy for people to skip the hard work needed to > > keep a stable kernel/userspace interface. > > > > The other worry which I've mentioned, but which I haven't seen > > addressed, is that even if you can use a perf from a newer > > kernel with an older kernel, this causes distributions a huge > > amount of pain, since they have to package two different kernel > > source packages, and only compile perf from the newer kernel > > source package. This leads to all sorts of confusion from a > > distribution packaging point of view. > > > > For example, assume that RHEL 5, which is using 2.6.32 or > > something like that, wants to use a newer e2fsck that does a > > better job fixing file system corruptions. If it were bundled > > with the kernel, then they would have to package up the v3.1 > > kernel sources, and have a source RPM that isn't used for > > building kernel sources, but just to build a newer version of > > e2fsck. Fortunately, they don't have to do that. They just pull > > down a newer version of e2fsprogs, and package, build, test, and > > ship that. > > > > In addition, suppose Red Hat ships a security bug fix which means > > a new kernel-image RPM has to be shipped. Does that mean that > > Red Hat has to ship new binary RPM's for any and all tools/* > > programs that they have packaged as separate RPM's? Or should > > installing a new kernel RPM also imply dropping new binaries in > > /usr/bin/perf, et al.? There are all sorts of packaging questions > > that are raised by integration, and from where I sit I don't think > > they've been adequately solved yet.
> > > > This in practice is not a big deal. > > There are many approaches for how the RPM can be built, but basically > getting the perf source is just a matter of > make perf-tar-src-pkg or friends such as > make perf-tarbz2-src-pkg > which will create perf-3.2.0-rc1.tar and perf-3.2.0-rc1.tar.bz2 > respectively, which can be used for the src rpms. This tarball can be used > as a separate package or subpackage. Great - the 'perf is impossible for distros' claim was a common counter-argument early in the perf project's lifetime - i'm glad it turned out to be bogus in practice. Would it further simplify distro-side life if all utilities deeply related to the kernel got built together and came in a single well-working package? kutils-3.2.0-rc1.rpm or such. They would always upgrade together with the kernel, so there would never be any forced backporting or separate errata pressure beyond the existing flow of -stable fixes. We do -stable fixes for tools/perf/ as well, for stability/security fixes, naturally - other tools would have to follow the regular kernel maintenance process to manage high priority fixes. Basically distros could rely on the kernel and its utilities being a coherent whole, which is expected to work together, which is maintained and built together and which, if it regresses, is handled by the regular -stable kernel regressions process with high priority. I expect it would grow one by one - it's not like we can or want to force utilities to go into the kernel proper. I'd also expect that new tools would be added initially - not existing ones moved. My question to you would rather be: would it make the life of distro release engineers gradually easier if this space grew gradually over the years, adding more and more critical tool functionality? Thanks, Ingo
* Gerd Hoffmann <kraxel@redhat.com> wrote: > > For reference, the default set of colors now is (from > > tools/perf/util/ui/browser.c): > > > > static struct ui_browser__colorset { > > const char *name, *fg, *bg; > > int colorset; > > } ui_browser__colorsets[] = { > > { > > .colorset = HE_COLORSET_TOP, > > .name = "top", > > .fg = "red", > > .bg = "default", > > Bad idea IMO. Setting only one of foreground+background gives > pretty much unpredictable results. My xterms have different > background colors, the ones with a root shell happen to have a > (dark) red background. Which results in red-on-dark-red text. Not > good. > > I'd strongly suggest to either set both background and foreground > to default or to set both to a specific color. When doing the > latter make sure the colors have enough contrast so they are > readable. Indeed. What we want is a set of distinctive colors - just two (background, foreground) colors are not enough - we also need colors to highlight certain information - we need 5-6 colors for the output to be maximally expressive. Is there a canonical way to handle that while still adapting to user preferences automatically by taking the background/foreground color scheme of the xterm into account? I suspect that to fix the worst of the fallout we could add some logic to detect low contrast combinations (too low color distance) and fall back to the foreground/background colors in that case. Plus allowing full .perfconfig configurability of all the relevant colors, for those with special taste. Thanks, Ingo
* Arnaldo Carvalho de Melo <acme@redhat.com> wrote: > > sure the colors have enough contrast so they are readable. > > Problem is figuring out something that is considered a good default > :-\ There will always be somebody that will complain. > > When doing the coding to allow using the default xterm colors I > tried several of the gnome-terminal xterm profiles and all looked > kinda sane for the "top" (hottest functions, with most hits) and > "medium" lines, where we combine some chosen foreground color > ("red" and "green"). > > Laziest solution would be: If the user customizes that much, could > the user please customize this as well? :-) I don't think it's acceptable to output unreadable color combinations (red on dark red, etc.) in any case, so we should add some safety mechanism that detects bad color combinations, plus a fallback, static color scheme. I like the way perf top/report currently adapts to the xterm color scheme. I use it both on dark and white backgrounds and it's easy to mistake it for --stdio output - which is good: a good TUI should blend into the console's color scheme. So i think we should keep that and just detect the few cases where it results in something unreadable. Thanks, Ingo
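The low-contrast fallback Ingo describes could be sketched roughly like this (a hypothetical illustration only: the function name, RGB values and threshold are all invented here, none of them come from perf's sources, and a real implementation would live in C inside the TUI code):

```shell
#!/bin/sh
# Hypothetical sketch: compare two RGB triples and fall back to the
# terminal's default colors when the squared color distance is too small.
color_distance_sq() {
    # args: r1 g1 b1 r2 g2 b2 - prints squared Euclidean distance
    dr=$(( $1 - $4 )); dg=$(( $2 - $5 )); db=$(( $3 - $6 ))
    echo $(( dr * dr + dg * dg + db * db ))
}

MIN_DIST_SQ=20000   # invented threshold for "readable contrast"

fg="255 0 0"        # red foreground, as in HE_COLORSET_TOP
bg="139 0 0"        # dark red background (Gerd's root xterm)

d=$(color_distance_sq $fg $bg)
if [ "$d" -lt "$MIN_DIST_SQ" ]; then
    echo "low contrast ($d), falling back to default fg/bg"
else
    echo "contrast ok ($d)"
fi
```

A real check would probably weight the channels by perceived luminance rather than use a plain Euclidean distance, which is just the simplest possible stand-in here.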
* Steven Rostedt <rostedt@goodmis.org> wrote: > On Tue, Nov 08, 2011 at 10:32:25AM +0100, Ingo Molnar wrote: > > > > None of the perf developers with whom i'm working complained > > about the shared repo so far - publicly or privately. By all > > means they are enjoying it and if you look at the stats and > > results you'll agree that they are highly productive working in > > that environment. > > Just because you brought it up. > > I personally find it awkward to work in the linux tools directory. > Maybe this is the reason that I haven't been such a big contributor > of perf. [...] Well, this is an argument with a long history - we've had it from the moment we started perf. i think the main underlying reason for it is that you still see perf as competition to ftrace instead of seeing perf as the child of ftrace, the next version of ftrace, the next iterative step of evolution :-/ Unfortunately there's not much that i can do about that beyond telling you that you are IMHO wrong - you, as the main ftrace developer, thinking that it's competition is a self-fulfilling expectation. Eventually someone will do the right thing and implement 'perf trace' (there's still the tip:tmp.perf/trace2 prototype branch) and users will flock to that workflow because it's so much more intuitive in practice. From what i've seen from the short prototype experiments i've conducted, it's a no-brainer superior workflow and design. > [...] I only pushed ktest into the kernel tools directory because > people convinced me to do so. Having it there didn't seem to bring > in many other developers. [...] It was somewhat similar with perf - contributors only arrived after it went upstream, and even then with a delay of a few releases. Also, and it pains me to have to mention it, but putting a .pl script into the kernel repo is not necessarily a recipe for attracting a lot of developers.
We went to great lengths to kill the .cc perf report file in perf, to keep the programming environment familiar to kernel developers and other low level utility folks. Also, obviously a tool has to be important, interesting and has to offer a distinct edge over other tools to attract contributors. Maybe tools/testing/ktest/ does not sound that interesting? Naming also matters: i sure would have moved it to tools/ktest/ - its name already suggests that it's about testing, so why repeat that twice? Sounds weird. In that sense tools/kvm/ is doing better than perf did: it has already attracted a core group of good, productive contributors despite still being an out of tree fork. The point here was that Pekka & co not only clearly enjoy working on tools/kvm/ and have no trouble attracting contributors, but also *rely* on it being in the kernel tree. Thanks, Ingo
On Tue, 2011-11-08 at 13:59 +0100, Ingo Molnar wrote: > > > Also the self monitor stuff, perf-tool doesn't use that for obvious > > reasons. > > Indeed, and that's PAPI's strong point. > > We could try to utilize it via some clever LD_PRELOAD trickery? Wouldn't be really meaningful, a perf-test case that covers it would be much saner.
Hi, > What we want to have is to have a set of distinctive colors - just > two (background, foreground) colors are not enough - we also need > colors to highlight certain information - we need 5-6 colors for the > output to be maximally expressive. Is there a canonical way to handle > that while still adapting to user preferences automatically by taking > background/foreground color scheme of the xterm into account? > I suspect to fix the worst of the fallout we could add some logic to > detect low contrast combinations (too low color distance) and fall > back to the foreground/background colors in that case. As far as I know it is pretty much impossible to figure out the foreground/background colors of the terminal you are running on. You can try some guesswork based on $TERM (linux console usually has a black background, xterm is white by default), but there will always be cases where it fails. You can run without colors. You can use bold to highlight things and reverse for the cursor. Surely a bit limited and not as pretty as colored, but works for sure everywhere. You can go for a linux-console style black background. Pretty much any color is readable here, so you should have no problems at all finding the 5-6 colors you want. You can go for an xterm-like light background, for example the lightgray used by older perf versions. I like that background color; the problem is that with most colors the contrast is pretty low. IMHO only red, blue and violet are readable on lightgray. And black of course. > Plus allowing full .perfconfig configurability of all the relevant > colors, for those with special taste. Sure. Maybe also allow multiple color sections and pick them by $TERM or a --colors switch, i.e. [colors "xterm"]. cheers, Gerd
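The $TERM guesswork Gerd mentions could look roughly like this (a hedged sketch: COLORFGBG is an rxvt/konsole convention of the form "fg;bg" and is often unset, and the case patterns here are illustrative, not exhaustive):

```shell
#!/bin/sh
# Guess whether the terminal background is dark or light.
guess_background() {
    # COLORFGBG, when set (rxvt, konsole), ends in a 0-15 color index
    # or the word "default" for the background.
    bg=${COLORFGBG##*;}
    case "$bg" in
        0|1|2|3|4|5|6|8)        echo dark;  return ;;
        7|9|10|11|12|13|14|15)  echo light; return ;;
    esac
    # Fall back to $TERM heuristics.
    case "$TERM" in
        linux)  echo dark ;;   # console default is black
        xterm*) echo light ;;  # classic xterm default is white
        *)      echo unknown ;;
    esac
}

COLORFGBG=""; TERM=linux
guess_background        # prints "dark"
```

As Gerd says, this will always fail for somebody, which is why it can only pick a default, never override explicit user configuration.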
On Wed, 09 Nov 2011 11:40:01 +0100, Gerd Hoffmann wrote: > far I know it is pretty much impossible to figure the > foreground/background colors of the terminal you are running on. You > can try some guesswork based on $TERM (linux console usually has black > background, xterm is white by default), but there will always be cases > where it fails. You can make it more explicit, similar to .vimrc: :set background=dark or :set background=light which in turn set the appropriate foreground colors. Hagen
Em Wed, Nov 09, 2011 at 11:40:01AM +0100, Gerd Hoffmann escreveu: > Hi, > > > What we want to have is to have a set of distinctive colors - just > > two (background, foreground) colors are not enough - we also need > > colors to highlight certain information - we need 5-6 colors for the > > output to be maximally expressive. Is there a canonical way to handle > > that while still adapting to user preferences automatically by taking > > background/foreground color scheme of the xterm into account? > > > I suspect to fix the worst of the fallout we could add some logic to > > detect low contrast combinations (too low color distance) and fall > > back to the foreground/background colors in that case. > > As far I know it is pretty much impossible to figure the > foreground/background colors of the terminal you are running on. You Glad to hear that, I thought I hadn't researched that much (I did). Hope somebody appears and tell us how it is done :-) > can try some guesswork based on $TERM (linux console usually has black > background, xterm is white by default), but there will always be cases > where it fails. > > You can run without colors. You can use bold to highlight things and > reverse for the cursor. Surely a bit limited and not as pretty as > colored, but works for sure everywhere. > > You can go for a linux-console style black background. Pretty much any > color is readable here, so you should have no problems at all to find > the 5-6 colors you want. > > You can go for a xterm-like light background, for example the lightgray > used by older perf versions. I like that background color, problem is > with most colors the contrast is pretty low. IMHO only red, blue and > violet are readable on lightgray. And black of course. > > > Plus allowing full .perfconfig configurability of all the relevant > > colors, for those with special taste. > > Sure. Maybe also allow multiple color sections and pick them by $TERM > or --colors switch, i.e. [colors "xterm"]. 
It's fully configurable as of now; what we need is a set of .perfconfigs that show how people think it looks better - we try them, set one as the default, and leave the others in tools/perf/Documentation/perfconfig/color.examples. - Arnaldo
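For illustration, a ~/.perfconfig along these lines might look like the following (hypothetical: the key names follow the colorset names from the ui_browser__colorsets table quoted earlier in the thread, but the exact names and value syntax may differ, and the per-$TERM section is only Gerd's proposal, not an implemented feature):

```ini
# Hypothetical example - values are "foreground, background"
[colors]
	top = red, default
	medium = green, default
	normal = default, default
	selected = black, lightgray

# Gerd's proposed per-$TERM variant (not implemented):
[colors "xterm"]
	top = red, lightgray
```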
Em Wed, Nov 09, 2011 at 10:21:09AM +0100, Ingo Molnar escreveu: > Eventually someone will do the right thing and implement 'perf trace' > (there's still the tip:tmp.perf/trace2 prototype branch) and users I'm working on it, reworking its patches into the new evlist/evsel abstractions, etc. - Arnaldo
Hi, >>> Plus allowing full .perfconfig configurability of all the relevant >>> colors, for those with special taste. >> >> Sure. Maybe also allow multiple color sections and pick them by $TERM >> or --colors switch, i.e. [colors "xterm"]. > > Its fully configurable as of now, what we need is a set of .perfconfigs > that show how people think its better, we try it, set it as the default, > leave the others in tools/perf/Documentation/perfconfig/color.examples. Yep, a set of examples works too. The colors are not fully configurable yet though. First, when switching all five colorsets to "default, default" there are still things which are colored (top bar, bottom bar, keys help display). Second there is no way to set terminal attributes (i.e. "top = bold" or "selected = reverse"). cheers, Gerd
Em Wed, Nov 09, 2011 at 01:26:34PM +0100, Gerd Hoffmann escreveu: > Hi, > > >>> Plus allowing full .perfconfig configurability of all the relevant > >>> colors, for those with special taste. > >> > >> Sure. Maybe also allow multiple color sections and pick them by $TERM > >> or --colors switch, i.e. [colors "xterm"]. > > > > Its fully configurable as of now, what we need is a set of .perfconfigs > > that show how people think its better, we try it, set it as the default, > > leave the others in tools/perf/Documentation/perfconfig/color.examples. > > Yep, a set of examples works too. > > The colors are not fully configurable yet though. First, when switching > all five colorsets to "default, default" there are still things which > are colored (top bar, bottom bar, keys help display). Second there is > no way to set terminal attributes (i.e. "top = bold" or "selected = > reverse"). Ok, adding those to the TODO list. /me goes to check if http://perf.wiki.kernel.org is back working so that we can have a _public_ TODO list, perhaps it may attract more contributors :) - Arnaldo
Em Wed, Nov 09, 2011 at 10:30:50AM -0200, Arnaldo Carvalho de Melo escreveu: > Em Wed, Nov 09, 2011 at 01:26:34PM +0100, Gerd Hoffmann escreveu: > > > Its fully configurable as of now, what we need is a set of .perfconfigs > > > that show how people think its better, we try it, set it as the default, > > > leave the others in tools/perf/Documentation/perfconfig/color.examples. > > Yep, a set of examples works too. > > The colors are not fully configurable yet though. First, when switching > > all five colorsets to "default, default" there are still things which > > are colored (top bar, bottom bar, keys help display). Second there is > > no way to set terminal attributes (i.e. "top = bold" or "selected = > > reverse"). > Ok, adding those to my TODO list. > /me goes to check if http://perf.wiki.kernel.org is back working so that > we can have a _public_ TODO list, perhaps it may attract more > contributors :) Oops, there is one, utterly old tho ;-\ I tried changing that and adding this entry but: https://perf.wiki.kernel.org/articles/u/s/e/Special~UserLogin_94cd.html Returns: The requested URL /articles/u/s/e/Special~UserLogin_94cd.html was not found on this server. Ingo, would that G+ page be useful for that? - Arnaldo
On Wed, 2011-11-09 at 10:33 -0200, Arnaldo Carvalho de Melo wrote: > > Ingo, would that G+ page be useful for that? > *groan* Can we please keep things sane?
Em Wed, Nov 09, 2011 at 01:46:42PM +0100, Peter Zijlstra escreveu: > On Wed, 2011-11-09 at 10:33 -0200, Arnaldo Carvalho de Melo wrote: > > > > Ingo, would that G+ page be useful for that? > > > *groan* > > Can we please keep things sane? ROFL, I had to ask that :-P - Arnaldo
* Arnaldo Carvalho de Melo <acme@redhat.com> wrote: > Em Wed, Nov 09, 2011 at 10:30:50AM -0200, Arnaldo Carvalho de Melo escreveu: > > Em Wed, Nov 09, 2011 at 01:26:34PM +0100, Gerd Hoffmann escreveu: > > > > Its fully configurable as of now, what we need is a set of .perfconfigs > > > > that show how people think its better, we try it, set it as the default, > > > > leave the others in tools/perf/Documentation/perfconfig/color.examples. > > > > Yep, a set of examples works too. > > > > The colors are not fully configurable yet though. First, when switching > > > all five colorsets to "default, default" there are still things which > > > are colored (top bar, bottom bar, keys help display). Second there is > > > no way to set terminal attributes (i.e. "top = bold" or "selected = > > > reverse"). > > > Ok, adding those to my TODO list. > > > /me goes to check if http://perf.wiki.kernel.org is back working so that > > we can have a _public_ TODO list, perhaps it may attract more > > contributors :) > > Oops, there is one, utterly old tho ;-\ > > I tried changing that and adding this entry but: > > https://perf.wiki.kernel.org/articles/u/s/e/Special~UserLogin_94cd.html > > Returns: > > The requested URL /articles/u/s/e/Special~UserLogin_94cd.html was not > found on this server. > > Ingo, would that G+ page be useful for that? Not sure - i think perf.wiki.kernel.org is a good place for documentation kind of information. The G+ page is more like for news items. Thanks, Ingo
On Tue, Nov 8, 2011 at 5:32 PM, Ingo Molnar <mingo@elte.hu> wrote: > > So i think you should seriously consider moving your projects *into* > tools/ instead of trying to get other projects to move out ... > > You should at least *try* the unified model before criticising it - > because currently you guys are preaching about sex while having sworn > a life long celibacy ;-) > Ingo, this is making Linux another BSD... manage everything in a single tree... Also, what is your criteria for merging a user-space project into kernel tree? Thanks.
Arnaldo Carvalho de Melo wrote: > Em Wed, Nov 09, 2011 at 11:40:01AM +0100, Gerd Hoffmann escreveu: > > Hi, > > > > > What we want to have is to have a set of distinctive colors - just > > > two (background, foreground) colors are not enough - we also need > > > colors to highlight certain information - we need 5-6 colors for the > > > output to be maximally expressive. Is there a canonical way to handle > > > that while still adapting to user preferences automatically by taking > > > background/foreground color scheme of the xterm into account? > > > > > I suspect to fix the worst of the fallout we could add some logic to > > > detect low contrast combinations (too low color distance) and fall > > > back to the foreground/background colors in that case. > > > > As far I know it is pretty much impossible to figure the > > foreground/background colors of the terminal you are running on. You > > Glad to hear that, I thought I hadn't researched that much (I did). Hope > somebody appears and tell us how it is done :-) In xterm, '\e]10;?\e\\' and '\e]11;?\e\\' will report the colors, e.g.: #!/bin/bash read -s -r -d \\ -p `printf '\e]10;?\e\\'` -t 1 fg [ $? -ne 0 ] && fg="no response" echo "foreground: $fg" | cat -v read -s -r -d \\ -p `printf '\e]11;?\e\\'` -t 1 bg [ $? -ne 0 ] && bg="no response" echo "background: $bg" | cat -v -jim
Em Wed, Nov 09, 2011 at 02:25:09PM -0500, Jim Paris escreveu: > Arnaldo Carvalho de Melo wrote: > > Em Wed, Nov 09, 2011 at 11:40:01AM +0100, Gerd Hoffmann escreveu: > > > As far I know it is pretty much impossible to figure the > > > foreground/background colors of the terminal you are running on. You > > Glad to hear that, I thought I hadn't researched that much (I did). Hope > > somebody appears and tell us how it is done :-) > In xterm, '\e]10;?\e\\' and '\e]11;?\e\\' will report the colors, e.g.: > #!/bin/bash > read -s -r -d \\ -p `printf '\e]10;?\e\\'` -t 1 fg > [ $? -ne 0 ] && fg="no response" > echo "foreground: $fg" | cat -v > read -s -r -d \\ -p `printf '\e]11;?\e\\'` -t 1 bg > [ $? -ne 0 ] && bg="no response" > echo "background: $bg" | cat -v gnome-terminal: [acme@felicio ~]$ ./a.sh foreground: no response background: no response [acme@felicio ~]$ :-( - Arnaldo
"I'd even argue that that C library is obviously something the kernelshould offer as well - so klibc is the way to go and would help usfurther streamline this and keep Linux quality high." I think there is code to share. Why not ?
On 09.11.2011, at 09:23, Ingo Molnar wrote:

> * Ted Ts'o <tytso@mit.edu> wrote:
>
>> On Tue, Nov 08, 2011 at 01:55:09PM +0100, Ingo Molnar wrote:
>>
>>> I guess you can do well with a split project as well - my main
>>> claim is that good compatibility comes *naturally* with
>>> integration.
>>
>> Here I have to disagree; my main worry is that integration makes it
>> *naturally* easy for people to skip the hard work needed to keep a
>> stable kernel/userspace interface.
>
> There's two observations i have:
> [...]
> I don't think your argument makes much sense: how come Linux, a 15
> MLOC monster project running for 20 years, has not been destroyed by
> the "lack of the safety valve" problem? Why would adding the at most
> 1 MLOC deeply kernel related Linux tool and library space to the
> kernel repo affect the dynamics negatively? We added more code to the
> kernel last year alone.
>
> Fact is, competition thrives within the Linux kernel as well. Why is
> a coherent, unified, focused project management an impediment to a
> good technological result? Especially when it comes to desktop
> computers / tablets / smartphones, where having a unified project is
> a *must*, so extreme are the requirements of users to get a coherent
> experience.
>
> Think about this plain fact: there's not a single successful
> smartphone OS on the market that does not have unified project
> management. Yes, correlation is not causation and such, but still,
> think about it for a moment.

I see your arguments, and I think others do too. Look at the BSD or
Solaris guys. Heck, even Windows and Mac OS have much tighter
user-space and kernel bindings than we do. However, I don't see any
real reason for us - who already have the strong syscall ABI as a
well-defined border - to change that anytime soon. So far it's worked
out pretty well IMHO.

But yes, if you were to push things from the bottom up, it would even
make sense. If you were to push glibc into the kernel, it would make
sense. I maybe still wouldn't agree with it, but it'd at least be
logical, because that's the next layer from the kernel's point of
view. If you were to push busybox into the kernel, it would also make
sense, so that you can have a fully self-contained system that doesn't
need external dependencies built inside a single tree. Again, I
wouldn't agree with it, because I like user space to be
multi-platform, but I could see the point. The same goes for udev and
systemd.

For kvm tool, however, I don't. It's very, very high up the stack. In
fact, I can't imagine many applications much higher up the stack than
a VM monitor. It needs to talk to the user (gtk?). It needs to talk to
the network (which might be implemented using vde). It needs to talk
to storage (which could be hidden behind user space libraries). It
basically is a consumer of all the interfaces we provide, 50 layers
above the kernel.

So I find the comparison of pulling GNOME3 and KVM Tool into the
kernel fair. Both depend on about the same amount of user space. And
even though KVM Tool might not depend on all that much today, I'm sure
you guys don't want to limit yourselves in scope just because you're
"in the kernel tree".

Outside of the kernel tree, you can make your own decisions. If
someone thinks it's a great idea to write device emulation in python
(I would love that!), he could go in and implement it without having
to worry about Linus possibly rejecting it because it's out of scope
for a "Linux kernel testing tool". If you want to create the greatest
GUI for virtualization the world has ever seen, you can just do it!
Nothing holds you back.

You already have a very thriving development community. There are
active contributors all over the place in KVM Tool. People already are
interested in it. Why do you want to be in the kernel tree so badly? I
honestly think it would hurt the project rather than help it.

So in all honesty, I wish for a KVM Tool outside of the kernel tree,
so it can thrive and evolve into something great - without artificial
borders. And I'm sure most of the KVM Tool developers wish for the
thriving part as well - which I believe can not happen inside the
kernel tree.


Alex
* Américo Wang <xiyou.wangcong@gmail.com> wrote:

> On Tue, Nov 8, 2011 at 5:32 PM, Ingo Molnar <mingo@elte.hu> wrote:
> >
> > So i think you should seriously consider moving your projects
> > *into* tools/ instead of trying to get other projects to move out
> > ...
> >
> > You should at least *try* the unified model before criticising it
> > - because currently you guys are preaching about sex while having
> > sworn a life long celibacy ;-)
>
> Ingo, this is making Linux another BSD... manage everything in a
> single tree...

It's not an all-or-nothing prospect. Linux user-space consists of well
in excess of 200 MLOC of code. The kernel is 15 MLOC. I think the
system-bound utilities that 'obviously' qualify for kernel inclusion
are around 1 MLOC in total size, i.e. less than 0.5% of all
user-space.

> Also, what is your criteria for merging a user-space project into
> kernel tree?

Well, my criteria go roughly along these lines:

 1) The developers use that model and are productive that way and
    produce a tool that has a significant upside.

 2) There's significant Linux-specific interaction between the
    user-space project and the kernel.

 3) The code is clean, well designed and follows the various
    principles laid out in Documentation/CodingStyle and
    Documentation/ManagementStyle, so that it can be merged into a
    prominent spot in the kernel tree, and the project is ready to
    live with the (non-trivial!) consequences of all that:

     - the project does -stable kernel backports of serious bugs

     - the project follows a strict "no regressions" policy

     - the project follows the kernel release cycle of 'Winter',
       'Spring', 'Summer' and 'Autumn' releases, follows the merge
       window requirements and implements the post-rc1 stabilization
       cycle.

These are not easy requirements, and i can well imagine that many
projects, even if they qualified on all other counts, would prefer to
stay out of tree rather than be subject to such strict release
engineering constraints.
Also, the requirements can be made stricter with time, based on
positive and negative experiences. Projects can 'die' and move out of
the kernel as well, if the kernel repo did not work out for them.

As long as it's all done gradually and on a case-by-case basis, Linux
can only benefit from this.

Thanks,

	Ingo
* Anca Emanuel <anca.emanuel@gmail.com> wrote:

> "I'd even argue that that C library is obviously something the
> kernel should offer as well - so klibc is the way to go and would
> help us further streamline this and keep Linux quality high."
>
> I think there is code to share. Why not?

The biggest downside of libc integration into the kernel would be that
the libc ABI is *vastly* larger than the kernel ABI, and i'm not sure
the kernel community is good enough to handle that. It's roughly 3000
ABI components, compared to the 300 ABI functions the kernel has today
- so at least an order of magnitude larger...

The biggest upside of libc integration into the kernel would be that
we could push Linux kernel improvements into the C library - and thus
to apps - immediately, along a much larger ABI surface. The
'specialization' resolution of the libc ABI is an order of magnitude
larger than that of the kernel's, giving many more opportunities for
good, workload-specific optimizations and unique solutions.

Today the latency of getting a kernel improvement to applications via
a change in the C library is above a year, so most kernel people don't
actually try to improve the C library, but instead try to find
improvements on the kernel level, which get to a distro within a
couple of months.

If the kernel offers a /proc/libc.so.6 library, then the kernel will
always be 'in sync' with the library (there's no library to install
on-disk - it would be offered by the kernel) and we could use
integration techniques like the vDSO uses today.

Thanks,

	Ingo
[offtopic] Any news on Mathieu Desnoyers' "Generic Ring Buffer
Library" (http://www.efficios.com/ringbuffer)?
* Alexander Graf <agraf@suse.de> wrote:

> [...]
>
> Outside of the kernel tree, you can make your own decisions. If
> someone thinks it's a great idea to write device emulation in
> python (I would love that!), he could go in and implement it
> without having to worry about Linus possibly rejecting it because
> it's out of scope for a "Linux kernel testing tool". If you want to
> create the greatest GUI for virtualization the world has ever seen,
> you can just do it! Nothing holds you back.

We actually recently added Python bindings to event tracing in perf:

 earth5:~/tip> find tools/perf/ -name '*.py'
 tools/perf/python/twatch.py
 tools/perf/util/setup.py
 tools/perf/scripts/python/Perf-Trace-Util/lib/Perf/Trace/Util.py
 tools/perf/scripts/python/Perf-Trace-Util/lib/Perf/Trace/Core.py
 tools/perf/scripts/python/Perf-Trace-Util/lib/Perf/Trace/SchedGui.py
 tools/perf/scripts/python/syscall-counts.py
 tools/perf/scripts/python/sctop.py
 tools/perf/scripts/python/sched-migration.py
 tools/perf/scripts/python/check-perf-trace.py
 tools/perf/scripts/python/futex-contention.py
 tools/perf/scripts/python/failed-syscalls-by-pid.py
 tools/perf/scripts/python/net_dropmonitor.py
 tools/perf/scripts/python/syscall-counts-by-pid.py
 tools/perf/scripts/python/netdev-times.py

... and Linus did not object (so far ;-) - nor does he IMHO have many
reasons to object as long as the code is sane and useful. Nor did
Linus object when perf extended its scope from profiling to tracing,
system monitoring, etc.

While i don't speak for Linus, the only 'hard boundary' that Linus
enforces, and expects all maintainers to enforce, that i'm aware of is
"don't do crazy crap". Everything else is possible as long as it's
high quality and reasonable, with a good upside story that is relevant
to the kernel - you can let your imagination run wild, there's no
artificial barrier that i'm aware of.
Anyway, i have outlined the rough consequences of a user-space project
being inside the kernel repo in this post:

  http://lkml.org/lkml/2011/11/10/86

... and they are definitely not trivial and easy to meet.

Thanks,

	Ingo
Hi,

>>> As far as I know it is pretty much impossible to figure out the
>>> foreground/background colors of the terminal you are running on.
>>
>> Glad to hear that, I thought I hadn't researched that much (I did).
>> Hope somebody appears and tells us how it is done :-)
>
> In xterm, '\e]10;?\e\\' and '\e]11;?\e\\' will report the colors,
> e.g.:
>
> #!/bin/bash
> read -s -r -d \\ -p "$(printf '\e]10;?\e\\')" -t 1 fg
> [ $? -ne 0 ] && fg="no response"
> echo "foreground: $fg" | cat -v
> read -s -r -d \\ -p "$(printf '\e]11;?\e\\')" -t 1 bg
> [ $? -ne 0 ] && bg="no response"
> echo "background: $bg" | cat -v

Works fine in xterm. Neither gnome-terminal (i.e. the vte widget) nor
konsole supports this, though.

cheers,
  Gerd
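[Editor's note: for reference, an xterm answers the OSC 11 query with an
escape sequence of the form `ESC ] 11 ; rgb:RRRR/GGGG/BBBB ESC \`. A
minimal bash sketch of parsing such a reply follows; the reply string is
hard-coded as a sample, since a real one would only arrive from an
xterm-compatible terminal via `read` as shown above.]

```shell
#!/bin/bash
# Parse an OSC color reply of the form: ESC ] 11 ; rgb:RRRR/GGGG/BBBB ESC \
# The reply below is a hard-coded sample; a real one would come from
# 'read' after sending the '\e]11;?\e\\' query to an xterm.
reply=$'\e]11;rgb:ffff/ffff/dddd\e\\'

spec=${reply#*;}       # strip the "ESC ] 11 ;" prefix
spec=${spec%%$'\e'*}   # strip the trailing "ESC \" terminator

# Split the rgb:RRRR/GGGG/BBBB spec into its three 16-bit hex channels
IFS=/ read -r r g b <<<"${spec#rgb:}"

echo "background: $spec (r=$r g=$g b=$b)"
```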
diff --git a/tools/testing/run-qemu/run-qemu.sh b/tools/testing/run-qemu/run-qemu.sh
new file mode 100755
index 0000000..70f194f
--- /dev/null
+++ b/tools/testing/run-qemu/run-qemu.sh
@@ -0,0 +1,338 @@
+#!/bin/bash
+#
+# QEMU Launcher
+#
+# This script enables simple use of the KVM and QEMU tool stack for
+# easy kernel testing. It allows you to pass either a host directory
+# to the guest or a disk image. Example usage:
+#
+# Run the host root fs inside a VM:
+#
+# $ ./scripts/run-qemu.sh -r /
+#
+# Run the same with SDL:
+#
+# $ ./scripts/run-qemu.sh -r / --sdl
+#
+# Or with a PPC build:
+#
+# $ ARCH=ppc ./scripts/run-qemu.sh -r /
+#
+# PPC with a mac99 model by passing options to QEMU:
+#
+# $ ARCH=ppc ./scripts/run-qemu.sh -r / -- -M mac99
+#
+
+USE_SDL=
+USE_VNC=
+USE_GDB=1
+KERNEL_BIN=arch/x86/boot/bzImage
+MON_STDIO=
+KERNEL_APPEND2=
+SERIAL=ttyS0
+SERIAL_KCONFIG=SERIAL_8250
+BASENAME=$(basename "$0")
+
+function usage() {
+	echo "
+$BASENAME allows you to execute a virtual machine with the Linux kernel
+that you just built. To only execute a simple VM, you can just run it
+on your root fs with \"-r / -a init=/bin/bash\"
+
+	-a, --append parameters
+		Append the given parameters to the kernel command line.
+
+	-d, --disk image
+		Add the image file as disk into the VM.
+
+	-D, --no-gdb
+		Don't run an xterm with gdb attached to the guest.
+
+	-r, --root directory
+		Use the specified directory as root directory inside the guest.
+
+	-s, --sdl
+		Enable SDL graphical output.
+
+	-S, --smp cpus
+		Set number of virtual CPUs.
+
+	-v, --vnc
+		Enable VNC graphical output.
+
+Examples:
+
+	Run the host root fs inside a VM:
+	$ ./scripts/run-qemu.sh -r /
+
+	Run the same with SDL:
+	$ ./scripts/run-qemu.sh -r / --sdl
+
+	Or with a PPC build:
+	$ ARCH=ppc ./scripts/run-qemu.sh -r /
+
+	PPC with a mac99 model by passing options to QEMU:
+	$ ARCH=ppc ./scripts/run-qemu.sh -r / -- -M mac99
+"
+}
+
+function require_config() {
+	if [ "$(grep CONFIG_$1=y .config)" ]; then
+		return
+	fi
+
+	echo "You need to enable CONFIG_$1 for run-qemu to work properly"
+	exit 1
+}
+
+function has_config() {
+	grep -q "CONFIG_$1=y" .config
+}
+
+function drive_if() {
+	if has_config VIRTIO_BLK; then
+		echo virtio
+	elif has_config ATA_PIIX; then
+		echo ide
+	else
+		echo "\
+Your kernel must have either VIRTIO_BLK or ATA_PIIX
+enabled for block device assignment" >&2
+		exit 1
+	fi
+}
+
+GETOPT=`getopt -o a:d:Dhr:sS:v \
+	--long append:,disk:,no-gdb,help,root:,sdl,smp:,vnc \
+	-n "$BASENAME" -- "$@"`
+
+if [ $? != 0 ]; then
+	echo "Terminating..." >&2
+	exit 1
+fi
+
+eval set -- "$GETOPT"
+
+while true; do
+	case "$1" in
+	-a|--append)
+		KERNEL_APPEND2="$KERNEL_APPEND2 $2"
+		shift
+		;;
+	-d|--disk)
+		QEMU_OPTIONS="$QEMU_OPTIONS -drive \
+file=$2,if=$(drive_if),cache=unsafe"
+		USE_DISK=1
+		shift
+		;;
+	-D|--no-gdb)
+		USE_GDB=
+		;;
+	-h|--help)
+		usage
+		exit 0
+		;;
+	-r|--root)
+		ROOTFS="$2"
+		shift
+		;;
+	-s|--sdl)
+		USE_SDL=1
+		;;
+	-S|--smp)
+		SMP="$2"
+		shift
+		;;
+	-v|--vnc)
+		USE_VNC=1
+		;;
+	--)
+		shift
+		break
+		;;
+	*)
+		echo "Could not parse option: $1" >&2
+		exit 1
+		;;
+	esac
+	shift
+done
+
+if [ ! "$ROOTFS" -a ! "$USE_DISK" ]; then
+	echo "\
+Error: Please specify at least -r or -d with a target \
+FS to run off of" >&2
+	exit 1
+fi
+
+# Try to find the KVM accelerated QEMU binary
+
+[ "$ARCH" ] || ARCH=$(uname -m)
+case $ARCH in
+x86_64)
+	KERNEL_BIN=arch/x86/boot/bzImage
+	# SUSE and Red Hat call the binary qemu-kvm
+	[ "$QEMU_BIN" ] || QEMU_BIN=$(which qemu-kvm 2>/dev/null)
+
+	# Debian and Gentoo call it kvm
+	[ "$QEMU_BIN" ] || QEMU_BIN=$(which kvm 2>/dev/null)
+
+	# QEMU's own build system calls it qemu-system-x86_64
+	[ "$QEMU_BIN" ] || QEMU_BIN=$(which qemu-system-x86_64 2>/dev/null)
+	;;
+i*86)
+	KERNEL_BIN=arch/x86/boot/bzImage
+	# SUSE and Red Hat call the binary qemu-kvm
+	[ "$QEMU_BIN" ] || QEMU_BIN=$(which qemu-kvm 2>/dev/null)
+
+	# Debian and Gentoo call it kvm
+	[ "$QEMU_BIN" ] || QEMU_BIN=$(which kvm 2>/dev/null)
+
+	# QEMU's own build system calls it qemu-system-i386 these days
+	[ "$QEMU_BIN" ] || QEMU_BIN=$(which qemu-system-i386 2>/dev/null)
+
+	# Older QEMU builds shipped the i386 binary as plain qemu
+	[ "$QEMU_BIN" ] || QEMU_BIN=$(which qemu 2>/dev/null)
+	;;
+s390*)
+	KERNEL_BIN=arch/s390/boot/image
+	[ "$QEMU_BIN" ] || QEMU_BIN=$(which qemu-system-s390x 2>/dev/null)
+	;;
+ppc*)
+	KERNEL_BIN=vmlinux
+
+	IS_64BIT=
+	has_config PPC64 && IS_64BIT=64
+	if has_config PPC_85xx; then
+		QEMU_OPTIONS="$QEMU_OPTIONS -M mpc8544ds"
+	elif has_config PPC_PSERIES; then
+		QEMU_OPTIONS="$QEMU_OPTIONS -M pseries"
+		SERIAL=hvc0
+		SERIAL_KCONFIG=HVC_CONSOLE
+	elif has_config PPC_PMAC; then
+		has_config SERIAL_PMACZILOG_TTYS || SERIAL=ttyPZ0
+		SERIAL_KCONFIG=SERIAL_PMACZILOG
+	else
+		echo "Unknown PPC board" >&2
+		exit 1
+	fi
+
+	[ "$QEMU_BIN" ] || QEMU_BIN=$(which qemu-system-ppc${IS_64BIT} 2>/dev/null)
+	;;
+esac
+
+if [ ! -e "$QEMU_BIN" ]; then
+	echo "\
+Could not find a usable QEMU binary. Please install one from \
+your distro or from source code using:
+
+  $ git clone git://git.qemu.org/qemu.git
+  $ cd qemu
+  $ ./configure
+  $ make -j
+  $ sudo make install
+" >&2
+	exit 1
+fi
+
+# The binaries without kvm in their name can be too old to support KVM,
+# so check for that before the user gets confused
+if [ ! "$(echo $QEMU_BIN | grep kvm)" -a \
+     ! "$($QEMU_BIN --help | egrep '^-machine')" ]; then
+	echo "Your QEMU binary is too old, please update to at least 0.15." >&2
+	exit 1
+fi
+QEMU_OPTIONS="$QEMU_OPTIONS -machine accel=kvm:tcg"
+
+# We need to check some .config variables to make sure we actually work
+# on the respective kernel.
+if [ ! -e .config ]; then
+	echo "\
+Please run this script on a fully compiled and configured
+Linux kernel build directory" >&2
+	exit 1
+fi
+
+if [ ! -e "$KERNEL_BIN" ]; then
+	echo "Could not find kernel binary: $KERNEL_BIN" >&2
+	exit 1
+fi
+
+QEMU_OPTIONS="$QEMU_OPTIONS -kernel $KERNEL_BIN"
+
+if [ "$USE_SDL" ]; then
+	# SDL is the default, so nothing to do
+	:
+elif [ "$USE_VNC" ]; then
+	QEMU_OPTIONS="$QEMU_OPTIONS -vnc :5"
+else
+	# When emulating a serial console, tell the kernel to use it as well
+	QEMU_OPTIONS="$QEMU_OPTIONS -nographic"
+	KERNEL_APPEND="$KERNEL_APPEND console=$SERIAL earlyprintk=serial"
+	MON_STDIO=1
+	require_config "$SERIAL_KCONFIG"
+fi
+
+if [ "$ROOTFS" ]; then
+	# Using rootfs with 9p
+	require_config "NET_9P_VIRTIO"
+	KERNEL_APPEND="$KERNEL_APPEND \
+root=/dev/root rootflags=rw,trans=virtio,version=9p2000.L rootfstype=9p"
+
+	# -virtfs fstype,path=...,security_model=[mapped|passthrough|none],mount_tag=tag
+	QEMU_OPTIONS="$QEMU_OPTIONS \
+-virtfs local,id=root,path=$ROOTFS,mount_tag=root,security_model=passthrough \
+-device virtio-9p-pci,fsdev=root,mount_tag=/dev/root"
+fi
+
+[ "$SMP" ] || SMP=1
+
+# User append args come last
+KERNEL_APPEND="$KERNEL_APPEND $KERNEL_APPEND2"
+
+############### Execution #################
+
+QEMU_OPTIONS="$QEMU_OPTIONS -smp $SMP"
+
+echo "
+	################# Linux QEMU launcher #################
+
+This script executes your currently built Linux kernel using QEMU. If KVM is
+available, it will also use KVM for fast virtualization of your guest.
+
+The intent is to make it very easy to run your kernel. If you need to do more
+advanced things, such as passing through real devices, please use QEMU command
+line options and add them to the $BASENAME command line using --.
+
+This tool is for simplicity, not world dominating functionality coverage.
+(just a hobby, won't be big and professional like libvirt)
+
+"
+
+if [ "$MON_STDIO" ]; then
+	echo "\
+### Your guest is bound to the current foreground shell. To quit the guest, ###
+### please use Ctrl-A x                                                     ###
+"
+fi
+
+echo -n "  Executing: $QEMU_BIN $QEMU_OPTIONS -append \"$KERNEL_APPEND\" "
+for i in "$@"; do
+	echo -n "\"$i\" "
+done
+echo
+echo
+
+GDB_PID=
+if [ "$USE_GDB" -a "$DISPLAY" -a -x "$(which xterm)" -a -e "$(which gdb)" ]; then
+	# Run a gdb console in parallel to the kernel
+
+	# XXX find out if the port is already in use
+	PORT=$(( $$ + 1024 ))
+	xterm -T "$BASENAME" -e "sleep 2; gdb vmlinux -ex 'target remote localhost:$PORT' -ex c" &
+	GDB_PID=$!
+	QEMU_OPTIONS="$QEMU_OPTIONS -gdb tcp::$PORT"
+fi
+
+$QEMU_BIN $QEMU_OPTIONS -append "$KERNEL_APPEND" "$@"
+wait $GDB_PID &>/dev/null
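[Editor's note: the script's feature detection boils down to grepping the
build's .config, as in `has_config` and `drive_if` above. A standalone
sketch of that logic follows; the temporary config file and its contents
are made up for illustration.]

```shell
#!/bin/bash
# Standalone sketch of the script's has_config/drive_if logic, run
# against a throwaway sample .config (contents made up for illustration).
CONFIG=$(mktemp)
cat > "$CONFIG" <<'EOF'
CONFIG_VIRTIO_BLK=y
# CONFIG_ATA_PIIX is not set
EOF

has_config() {
	grep -q "CONFIG_$1=y" "$CONFIG"
}

# Pick the disk interface QEMU should use, based on what the kernel supports
drive_if() {
	if has_config VIRTIO_BLK; then
		echo virtio
	elif has_config ATA_PIIX; then
		echo ide
	else
		echo "no usable block driver configured" >&2
		return 1
	fi
}

drive_if    # prints "virtio" for the sample config above
rm -f "$CONFIG"
```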
On LinuxCon I had a nice chat with Linus on what he thinks kvm-tool
would be doing and what he expects from it. Basically he wants a
small and simple tool he and other developers can run to try out and
see if the kernel they just built actually works.

Fortunately, QEMU can do that today already! The only piece that was
missing was the "simple" piece of the equation, so here is a script
that wraps around QEMU and executes a kernel you just built.

If you do have KVM around and are not cross-compiling, it will use
KVM. But if you don't, you can still fall back to emulation mode and
at least check if your kernel still does what you expect. I only
implemented support for s390x and ppc there, but it's easily
extensible to more platforms, as QEMU can emulate (and virtualize)
pretty much any platform out there.

If you don't have qemu installed, please do so before using this
script. Your distro should provide a package for it (it might even be
called "kvm"). If not, just compile it from source - it's not hard!

To quickly get going, just execute the following as user:

    $ ./tools/testing/run-qemu/run-qemu.sh -r / -a init=/bin/bash

This will drop you into a shell on your rootfs.

Happy hacking!

Signed-off-by: Alexander Graf <agraf@suse.de>

---
v1 -> v2:

  - fix naming of QEMU
  - use grep -q for has_config
  - support multiple -a args
  - spawn gdb on execution
  - pass through qemu options
  - don't use qemu-system-x86_64 on i386
  - add funny sentence to startup text
  - more helpful error messages

v2 -> v3:

  - move to tools/testing
  - fix running: message

( sorry for sending this version so late - I got caught up in random
  other stuff )

---
 tools/testing/run-qemu/run-qemu.sh |  338 ++++++++++++++++++++++++++++++++++++
 1 files changed, 338 insertions(+), 0 deletions(-)
 create mode 100755 tools/testing/run-qemu/run-qemu.sh