Message ID | DA586906BA1FFC4384FCFD6429ECE86035293492@shzsmsx502.ccr.corp.intel.com |
---|---|
State | New |
Headers | show |
On 01/26/2010 09:25 PM, Zheng, Jiajia wrote: > Add command p2v to translate Guest physical address to Host virtual address. > For what purpose? > Signed-off-by: Max Asbock<masbock@linux.vnet.ibm.com> > Jiajia Zheng<jiajia.zheng@intel.com> > --- > diff --git a/monitor.c b/monitor.c > index b33b01f..83d9ac7 100644 > --- a/monitor.c > +++ b/monitor.c > @@ -668,6 +668,11 @@ static void do_info_uuid(Monitor *mon, QObject **ret_data) > *ret_data = qobject_from_jsonf("{ 'UUID': %s }", uuid); > } > > +static void do_info_p2v(Monitor *mon) > +{ > + monitor_printf(mon, "p2v implemented\n"); > +} > These should be implemented as QMP commands. > /* get the current CPU defined by the user */ > static int mon_set_cpu(int cpu_index) > { > @@ -2283,6 +2288,14 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict) > break; > } > } > +static void do_p2v(Monitor *mon, const QDict *qdict) > +{ > + target_long size = 4096; > + target_long addr = qdict_get_int(qdict, "addr"); > + > + monitor_printf(mon, "Guest physical address %p is mapped at host virtual address %p\n", (void *)addr, cpu_physical_memory_map( (target_phys_addr_t)addr, (target_phys_addr_t *)&size, 0)); > This isn't quite right. It assumes TARGET_PAGE_SIZE is 4k which is certainly not always true. It also assumes that cpu_physical_memory_map() something that has some meaning which isn't necessarily the case. It could be a pointer to a bounce buffer. Could you give an end-to-end description of how you expect this mechanism to be used so we can work out a more appropriate set of interfaces. I assume this is MCE related. Regards, Anthony Liguori
On Wed, 2010-01-27 at 15:39 -0600, Anthony Liguori wrote: > On 01/26/2010 09:25 PM, Zheng, Jiajia wrote: > > Add command p2v to translate Guest physical address to Host virtual address. > > > > For what purpose? > > > Signed-off-by: Max Asbock<masbock@linux.vnet.ibm.com> > > Jiajia Zheng<jiajia.zheng@intel.com> > > --- > > diff --git a/monitor.c b/monitor.c > > index b33b01f..83d9ac7 100644 > > --- a/monitor.c > > +++ b/monitor.c > > @@ -668,6 +668,11 @@ static void do_info_uuid(Monitor *mon, QObject **ret_data) > > *ret_data = qobject_from_jsonf("{ 'UUID': %s }", uuid); > > } > > > > +static void do_info_p2v(Monitor *mon) > > +{ > > + monitor_printf(mon, "p2v implemented\n"); > > +} > > > > These should be implemented as QMP commands. > > > /* get the current CPU defined by the user */ > > static int mon_set_cpu(int cpu_index) > > { > > @@ -2283,6 +2288,14 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict) > > break; > > } > > } > > +static void do_p2v(Monitor *mon, const QDict *qdict) > > +{ > > + target_long size = 4096; > > + target_long addr = qdict_get_int(qdict, "addr"); > > + > > + monitor_printf(mon, "Guest physical address %p is mapped at host virtual address %p\n", (void *)addr, cpu_physical_memory_map( (target_phys_addr_t)addr, (target_phys_addr_t *)&size, 0)); > > > > This isn't quite right. It assumes TARGET_PAGE_SIZE is 4k which is > certainly not always true. It also assumes that > cpu_physical_memory_map() something that has some meaning which isn't > necessarily the case. It could be a pointer to a bounce buffer. > > Could you give an end-to-end description of how you expect this > mechanism to be used so we can work out a more appropriate set of > interfaces. I assume this is MCE related. > The purpose of this is to translate a guest physical address to a host virtual address. This was indeed used for MCE testing. The p2v command provides one step in a chain of translations from guest virtual to guest physical to host virtual to host physical. Host physical is then used to inject a machine check error. As a consequence the HPOISON code on the host and the MCE injection code in qemu are exercised. I was always assuming that this implementation perhaps isn't the most optimal, but it simply worked for our test case. What would an appropriate method be to get a host virtual address for guest physical address that represents a page of RAM? thanks, Max
Hi, Any futher comments for this patch so that we can modify? thanks, jiajia Max Asbock wrote: > On Wed, 2010-01-27 at 15:39 -0600, Anthony Liguori wrote: >> On 01/26/2010 09:25 PM, Zheng, Jiajia wrote: >>> Add command p2v to translate Guest physical address to Host virtual >>> address. >>> >> >> For what purpose? >> >>> Signed-off-by: Max Asbock<masbock@linux.vnet.ibm.com> >>> Jiajia Zheng<jiajia.zheng@intel.com> >>> --- >>> diff --git a/monitor.c b/monitor.c >>> index b33b01f..83d9ac7 100644 >>> --- a/monitor.c >>> +++ b/monitor.c >>> @@ -668,6 +668,11 @@ static void do_info_uuid(Monitor *mon, QObject >>> **ret_data) *ret_data = qobject_from_jsonf("{ 'UUID': %s }", >>> uuid); } >>> >>> +static void do_info_p2v(Monitor *mon) >>> +{ >>> + monitor_printf(mon, "p2v implemented\n"); >>> +} >>> >> >> These should be implemented as QMP commands. >> >>> /* get the current CPU defined by the user */ >>> static int mon_set_cpu(int cpu_index) >>> { >>> @@ -2283,6 +2288,14 @@ static void do_inject_mce(Monitor *mon, >>> const QDict *qdict) break; } >>> } >>> +static void do_p2v(Monitor *mon, const QDict *qdict) +{ >>> + target_long size = 4096; >>> + target_long addr = qdict_get_int(qdict, "addr"); + >>> + monitor_printf(mon, "Guest physical address %p is mapped at >>> host virtual address %p\n", (void *)addr, cpu_physical_memory_map( >>> (target_phys_addr_t)addr, (target_phys_addr_t *)&size, 0)); >>> >> >> This isn't quite right. It assumes TARGET_PAGE_SIZE is 4k which is >> certainly not always true. It also assumes that >> cpu_physical_memory_map() something that has some meaning which isn't >> necessarily the case. It could be a pointer to a bounce buffer. >> >> Could you give an end-to-end description of how you expect this >> mechanism to be used so we can work out a more appropriate set of >> interfaces. I assume this is MCE related. >> > > The purpose of this is to translate a guest physical address to a host > virtual address. > This was indeed used for MCE testing. The p2v command provides one > step in a chain of translations from guest virtual to guest physical > to host virtual to host physical. Host physical is then used to > inject a machine check error. As a consequence the HPOISON code on > the host and the MCE injection code in qemu are exercised. > I was always assuming that this implementation perhaps isn't the most > optimal, but it simply worked for our test case. > > What would an appropriate method be to get a host virtual address for > guest physical address that represents a page of RAM? > > thanks, > Max
On 02/02/2010 10:04 PM, Zheng, Jiajia wrote: > Hi, > Any futher comments for this patch so that we can modify? > Unfortunately, I see no way to modify this feature so that it would be acceptable to upstream. I think you're going to have to carry this patch for your own debugging. The problem is, the approach you're taking to testing is extremely racy. Even just with the p2v command, the mapping within qemu that translates guest physical to host virtual is not stable over any given period of time. By the time the monitor prints out the virtual mapping of a host page, it's already potentially wrong. Likewise, the virtual->physical mapping within the host is also not stable so using that virtual address to lookup a physical address in the kernel is inherently racy. If you want to integrate MCE testing into qemu using the host to inject MCEs, you'll need to figure out a way to do it that allows the whole thing to be done without potentially killing the wrong process or injecting it to within the wrong address within qemu. I suspect that's going to be very difficult to achieve. Regards, Anthony Liguori
>If you want to integrate MCE testing into qemu using the host >to inject >MCEs, you'll need to figure out a way to do it that allows the whole >thing to be done without potentially killing the wrong process or In the test it's very simple. The process is kept running during the injection. With that its addresses do not change. So that problem is already solved. The feature doesn't aim to be a general feature for normal users, it's merely for QA. >I suspect that's going to be very difficult to achieve. Not at all; in a test suite that controls everything it's very simple. -Andi
On 02/03/2010 08:11 AM, Kleen, Andi wrote: > >> If you want to integrate MCE testing into qemu using the host >> to inject >> MCEs, you'll need to figure out a way to do it that allows the whole >> thing to be done without potentially killing the wrong process or >> > In the test it's very simple. The process is kept running during > the injection. With that its addresses do not change. > > So that problem is already solved. > > The feature doesn't aim to be a general feature for normal users, > it's merely for QA. > Yeah, but if we put a feature in qemu, we need to be able to support it for anyone who wants to use it. Adding something for a very particular test suite that won't work in normal circumstances is just asking for trouble IMHO. I still don't really understand all the pieces that are involved here. Why do we need a guest physical address? Are we testing reflecting MCEs from the host into a guest? Since that functionality isn't in qemu aren't we putting the cart before the horse here? Regards, Anthony Liguori
On Wed, 2010-02-03 at 08:23 -0600, Anthony Liguori wrote: > On 02/03/2010 08:11 AM, Kleen, Andi wrote: > > > >> If you want to integrate MCE testing into qemu using the host > >> to inject > >> MCEs, you'll need to figure out a way to do it that allows the whole > >> thing to be done without potentially killing the wrong process or > >> > > In the test it's very simple. The process is kept running during > > the injection. With that its addresses do not change. > > > > So that problem is already solved. > > > > The feature doesn't aim to be a general feature for normal users, > > it's merely for QA. > > > > Yeah, but if we put a feature in qemu, we need to be able to support it > for anyone who wants to use it. > > Adding something for a very particular test suite that won't work in > normal circumstances is just asking for trouble IMHO. Perhaps we can enable it at compile time, so in production builds that is disabled. It's not *that* much of a problem. The purpose for this patch is that we can integrate the whole test suite into kvm autotest, and we have ways to conveniently build qemu source code, passing special flags to configure and everything. So having this patch upstream at some point would make things easier, since the folks at intel won't have to keep maintaining that patch for every little change on qemu code, and we'll have better qa for qemu and kvm.
>Yeah, but if we put a feature in qemu, we need to be able to >support it >for anyone who wants to use it. It's useful for anyone who wants to use it for testing purposes. And it's useful to make sure the qemu/kernel/kvm machine check injection code works. >Adding something for a very particular test suite that won't work in >normal circumstances is just asking for trouble IMHO. RAS features generally need associated testing/injection hooks, otherwise they don't get tested regularly enough and bitrot. >I still don't really understand all the pieces that are >involved here. >Why do we need a guest physical address? Are we testing >reflecting MCEs >from the host into a guest? Since that functionality isn't in qemu >aren't we putting the cart before the horse here? qemu has support for triggering MCEs on the monitor. Also the KVM code base has support for forwarding the MCEs automatically. -Andi
On 02/03/2010 09:49 AM, Kleen, Andi wrote: > >> Yeah, but if we put a feature in qemu, we need to be able to >> support it >> for anyone who wants to use it. >> > It's useful for anyone who wants to use it for testing purposes. > And it's useful to make sure the qemu/kernel/kvm machine check > injection code works. > > >> Adding something for a very particular test suite that won't work in >> normal circumstances is just asking for trouble IMHO. >> > RAS features generally need associated testing/injection hooks, > otherwise they don't get tested regularly enough and bitrot. > > >> I still don't really understand all the pieces that are >> involved here. >> Why do we need a guest physical address? Are we testing >> reflecting MCEs >> > >from the host into a guest? Since that functionality isn't in qemu > >> aren't we putting the cart before the horse here? >> > qemu has support for triggering MCEs on the monitor. > > Also the KVM code base has support for forwarding the MCEs automatically. > KVM has all of the information you need (guest physical -> host physical mapping). It can also pin the mapping making it much safer to interface at that level. You should probably add an ioctl interface to KVM to get a host physical from a given guest physical and then use that to do the MCE injection. You would need to write a little helper tool and you would need a way to get an fd for an existing guest. Regards, Anthony Liguori But then it's not a user visible interface. > -Andi >
Hi, Anthony Anthony Liguori wrote: > On 02/03/2010 09:49 AM, Kleen, Andi wrote: >> >>> Yeah, but if we put a feature in qemu, we need to be able to >>> support it for anyone who wants to use it. >>> >> It's useful for anyone who wants to use it for testing purposes. >> And it's useful to make sure the qemu/kernel/kvm machine check >> injection code works. >> >> >>> Adding something for a very particular test suite that won't work in >>> normal circumstances is just asking for trouble IMHO. >>> >> RAS features generally need associated testing/injection hooks, >> otherwise they don't get tested regularly enough and bitrot. >> >> >>> I still don't really understand all the pieces that are >>> involved here. >>> Why do we need a guest physical address? Are we testing >>> reflecting MCEs >>> >>> from the host into a guest? Since that functionality isn't in qemu >> >>> aren't we putting the cart before the horse here? >>> >> qemu has support for triggering MCEs on the monitor. >> >> Also the KVM code base has support for forwarding the MCEs >> automatically. >> > > KVM has all of the information you need (guest physical -> host > physical mapping). It can also pin the mapping making it much safer > to interface at that level. You should probably add an ioctl > interface to KVM to get a host physical from a given guest physical > and then use that to do the MCE injection. You would need to write a > little helper tool and you would need a way to get an fd for an > existing guest. Thanks. Will rewrite the patch the way you suggested. bests, jiajia > Regards, > > Anthony Liguori > > But then it's not a user visible interface. > >> -Andi
On 02/03/2010 06:14 PM, Anthony Liguori wrote: >>> aren't we putting the cart before the horse here? >> qemu has support for triggering MCEs on the monitor. >> >> Also the KVM code base has support for forwarding the MCEs >> automatically. > > KVM has all of the information you need (guest physical -> host > physical mapping). It can also pin the mapping making it much safer > to interface at that level. You should probably add an ioctl > interface to KVM to get a host physical from a given guest physical > and then use that to do the MCE injection. You would need to write a > little helper tool and you would need a way to get an fd for an > existing guest. > It would be simpler to trigger the whole thing from within qemu.
On 02/07/2010 08:03 AM, Avi Kivity wrote: > On 02/03/2010 06:14 PM, Anthony Liguori wrote: >>>> aren't we putting the cart before the horse here? >>> qemu has support for triggering MCEs on the monitor. >>> >>> Also the KVM code base has support for forwarding the MCEs >>> automatically. >> >> KVM has all of the information you need (guest physical -> host >> physical mapping). It can also pin the mapping making it much safer >> to interface at that level. You should probably add an ioctl >> interface to KVM to get a host physical from a given guest physical >> and then use that to do the MCE injection. You would need to write a >> little helper tool and you would need a way to get an fd for an >> existing guest. >> > > It would be simpler to trigger the whole thing from within qemu. Only insofar as you don't have to deal with getting at the VM fd. You can avoid the problem by having the kvm ioctl interface take a pid or something. The problem I have with driving this through qemu is that it's purely a test mechanism and it's not even one that we can verify within qemu. It's a command that isn't really useful in anything but a very specific context. I don't think it's the right thing to do to add dozens of monitor commands to enable test harnesses that are not related to actual functionality within qemu. Regards, Anthony Liguori
On 02/07/2010 06:23 PM, Anthony Liguori wrote: > On 02/07/2010 08:03 AM, Avi Kivity wrote: >> On 02/03/2010 06:14 PM, Anthony Liguori wrote: >>>>> aren't we putting the cart before the horse here? >>>> qemu has support for triggering MCEs on the monitor. >>>> >>>> Also the KVM code base has support for forwarding the MCEs >>>> automatically. >>> >>> KVM has all of the information you need (guest physical -> host >>> physical mapping). It can also pin the mapping making it much safer >>> to interface at that level. You should probably add an ioctl >>> interface to KVM to get a host physical from a given guest physical >>> and then use that to do the MCE injection. You would need to write >>> a little helper tool and you would need a way to get an fd for an >>> existing guest. >>> >> >> It would be simpler to trigger the whole thing from within qemu. > > Only insofar as you don't have to deal with getting at the VM fd. You > can avoid the problem by having the kvm ioctl interface take a pid or > something. That's a racy interface. > > The problem I have with driving this through qemu is that it's purely > a test mechanism and it's not even one that we can verify within > qemu. It's a command that isn't really useful in anything but a very > specific context. > > I don't think it's the right thing to do to add dozens of monitor > commands to enable test harnesses that are not related to actual > functionality within qemu. Well, we need to provide a reasonable alternative. One might be to use -mempath (which is hacky by itself, but so far we have no alternative) and use an external tool on the memory object to poison it. An advantage is that you can use it independently of kvm.
On 02/07/2010 10:31 AM, Avi Kivity wrote: >> Only insofar as you don't have to deal with getting at the VM fd. >> You can avoid the problem by having the kvm ioctl interface take a >> pid or something. > > > That's a racy interface. The mechanism itself is racy. That said, pid's don't recycle very quickly so the chances of running into a practical issue is quite small. > Well, we need to provide a reasonable alternative. I think this is the sort of thing that really needs to be a utility that lives outside of qemu. I'm absolutely in favor of exposing enough internals to let people do interesting things provided it's reasonably correct. > One might be to use -mempath (which is hacky by itself, but so far we > have no alternative) and use an external tool on the memory object to > poison it. An advantage is that you can use it independently of kvm. It would help if the actual requirements were spelled out a bit more. What exactly needs validating? Do we need to validate that a poisoning a host physical address results in a very particular guest page getting poisoned? Is it not enough to just choose a random anonymous memory area within the qemu process, generate an MCE to that location, see whether qemu SIGBUS's. If it doesn't, validate that an MCE has been received in the guest? But FWIW, I think a set of per-VM directories in sysfs could be very useful for this sort of debugging. Maybe we should consider having the equivalent of a QMP-for-debugging session. This would be a special QMP session that we basically provided no compatibility or even sanity guarantees that was specifically there for debugging. I would expect that it be disabled in any production build (even perhaps even by default in the general build). Regards, Anthony Liguori
Anthony Liguori wrote: > On 02/07/2010 10:31 AM, Avi Kivity wrote: >>> Only insofar as you don't have to deal with getting at the VM fd. >>> You can avoid the problem by having the kvm ioctl interface take a >>> pid or something. >> >> >> That's a racy interface. > > The mechanism itself is racy. That said, pid's don't recycle very > quickly so the chances of running into a practical issue is quite > small. > >> Well, we need to provide a reasonable alternative. > > I think this is the sort of thing that really needs to be a utility > that lives outside of qemu. I'm absolutely in favor of exposing > enough internals to let people do interesting things provided it's > reasonably correct. > >> One might be to use -mempath (which is hacky by itself, but so far we >> have no alternative) and use an external tool on the memory object to >> poison it. An advantage is that you can use it independently of kvm. > > It would help if the actual requirements were spelled out a bit more. > What exactly needs validating? Do we need to validate that a > poisoning a host physical address results in a very particular guest > page getting poisoned? > > Is it not enough to just choose a random anonymous memory area within > the qemu process, generate an MCE to that location, see whether qemu > SIGBUS's. If it doesn't, validate that an MCE has been received in > the guest? Hi Anthony, I think we have to poinson a very particular guest page rather than a random one. We tried random poisoning before, which has potential issues. For random poisoning, it is possible that the poisoned page cannot be isolated in guest kernel in early kill time, hence some time later kvm tries to read/write the poisoned page, which triggered the late AR kill by which qume-kvm exits. For the above reason, a safe page should be selected inside kvm. So we do the following things in KVM RAS testing. - Start a process in the guest OS, get the virtual address from guest - Translate guest virtual address to guest physical address - Translate guest physical address to host virtual address - Translate host virtual address to host physical address - Software inject an AO MCE at that physical address bests, jiajia > > But FWIW, I think a set of per-VM directories in sysfs could be very > useful for this sort of debugging. > > Maybe we should consider having the equivalent of a QMP-for-debugging > session. This would be a special QMP session that we basically > provided no compatibility or even sanity guarantees that was > specifically there for debugging. I would expect that it be disabled > in any production build (even perhaps even by default in the general > build). > > Regards, > > Anthony Liguori
On 02/08/2010 12:09 AM, Anthony Liguori wrote: > On 02/07/2010 10:31 AM, Avi Kivity wrote: >>> Only insofar as you don't have to deal with getting at the VM fd. >>> You can avoid the problem by having the kvm ioctl interface take a >>> pid or something. >> >> >> That's a racy interface. > > The mechanism itself is racy. That said, pid's don't recycle very > quickly so the chances of running into a practical issue is quite small. While a low probability of a race is acceptable for a test tool, it isn't for a kernel interface. >> Well, we need to provide a reasonable alternative. > > I think this is the sort of thing that really needs to be a utility > that lives outside of qemu. I'm absolutely in favor of exposing > enough internals to let people do interesting things provided it's > reasonably correct. I agree that's desirable. However in light of the changable gpa->hva->hpa mappings, this may not be feasible. > >> One might be to use -mempath (which is hacky by itself, but so far we >> have no alternative) and use an external tool on the memory object to >> poison it. An advantage is that you can use it independently of kvm. > > It would help if the actual requirements were spelled out a bit more. > What exactly needs validating? Do we need to validate that a > poisoning a host physical address results in a very particular guest > page getting poisoned? > > Is it not enough to just choose a random anonymous memory area within > the qemu process, generate an MCE to that location, see whether qemu > SIGBUS's. If it doesn't, validate that an MCE has been received in > the guest? /proc/pid/pagemap may help, though that's racy too. If you pick the largest vma (or use -mempath) you're pretty much guaranteed to hit on the guest memory area. > > But FWIW, I think a set of per-VM directories in sysfs could be very > useful for this sort of debugging. > > Maybe we should consider having the equivalent of a QMP-for-debugging > session. This would be a special QMP session that we basically > provided no compatibility or even sanity guarantees that was > specifically there for debugging. I would expect that it be disabled > in any production build (even perhaps even by default in the general > build). > We have 'info cpus' that shows the vcpu->thread mappings, allowing management to pin cpus. Why not have 'info memory' that shows guest numa nodes and host virtual addresses? The migrate_pages() syscall takes a pid so it can be used by qemu's controller to load-balance a numa machine, and this can also be used by the poisoner to do its work.
diff --git a/monitor.c b/monitor.c index b33b01f..83d9ac7 100644 --- a/monitor.c +++ b/monitor.c @@ -668,6 +668,11 @@ static void do_info_uuid(Monitor *mon, QObject **ret_data) *ret_data = qobject_from_jsonf("{ 'UUID': %s }", uuid); } +static void do_info_p2v(Monitor *mon) +{ + monitor_printf(mon, "p2v implemented\n"); +} + /* get the current CPU defined by the user */ static int mon_set_cpu(int cpu_index) { @@ -2283,6 +2288,14 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict) break; } } +static void do_p2v(Monitor *mon, const QDict *qdict) +{ + target_long size = 4096; + target_long addr = qdict_get_int(qdict, "addr"); + + monitor_printf(mon, "Guest physical address %p is mapped at host virtual address %p\n", (void *)addr, cpu_physical_memory_map( (target_phys_addr_t)addr, (target_phys_addr_t *)&size, 0)); +} + #endif static void do_getfd(Monitor *mon, const QDict *qdict, QObject **ret_data) @@ -2659,6 +2672,13 @@ static const mon_cmd_t info_cmds[] = { .mhandler.info = do_info_qdm, }, { + .name = "p2v", + .args_type = "", + .params = "", + .help = "translate guest physical to host virtual address", + .mhandler.info = do_info_p2v, + }, + { .name = "roms", .args_type = "", .params = "", diff --git a/qemu-monitor.hx b/qemu-monitor.hx index 9e3ea3c..ab9743c 100644 --- a/qemu-monitor.hx +++ b/qemu-monitor.hx @@ -326,6 +326,16 @@ Start gdbserver session (default @var{port}=1234) ETEXI { + .name = "p2v", + .args_type = "fmt:/,addr:l", + .params = "/fmt addr", + .help = "translate guest physical 'addr' to host virtual address", + .mhandler.cmd = do_p2v, + }, +STEXI +ETEXI + + { .name = "x", .args_type = "fmt:/,addr:l", .params = "/fmt addr",
Add command p2v to translate Guest physical address to Host virtual address. Signed-off-by: Max Asbock <masbock@linux.vnet.ibm.com> Jiajia Zheng <jiajia.zheng@intel.com> ---