Message ID | 372238c800e0d57815f472502fdf78e53463bbb6.1265232579.git.jan.kiszka@siemens.com |
---|---|
State | New |
Headers | show |
On Wed, Feb 03, 2010 at 10:29:45PM +0100, Jan Kiszka wrote: > So far we synchronized any dirty VCPU state back into the kernel before > updating the guest debug state. This was a tribute to a deficit in x86 > kernels before 2.6.33. But as this is an arch-dependent issue, it is > better handle in the x86 part of KVM and remove the writeback point for > generic code. Jan, This patch breaks migration.
Marcelo Tosatti wrote: > On Wed, Feb 03, 2010 at 10:29:45PM +0100, Jan Kiszka wrote: >> So far we synchronized any dirty VCPU state back into the kernel before >> updating the guest debug state. This was a tribute to a deficit in x86 >> kernels before 2.6.33. But as this is an arch-dependent issue, it is >> better handle in the x86 part of KVM and remove the writeback point for >> generic code. > > Jan, > > This patch breaks migration. Can you elaborate what you did? I can't reproduce, and I do not see any conceptual issue (given that guest debugging conflicts with migration anyway). Jan
On Thu, Feb 04, 2010 at 01:33:50AM +0100, Jan Kiszka wrote: > Marcelo Tosatti wrote: > > On Wed, Feb 03, 2010 at 10:29:45PM +0100, Jan Kiszka wrote: > >> So far we synchronized any dirty VCPU state back into the kernel before > >> updating the guest debug state. This was a tribute to a deficit in x86 > >> kernels before 2.6.33. But as this is an arch-dependent issue, it is > >> better handle in the x86 part of KVM and remove the writeback point for > >> generic code. > > > > Jan, > > > > This patch breaks migration. > > Can you elaborate what you did? I can't reproduce, and I do not see any > conceptual issue (given that guest debugging conflicts with migration > anyway). kvm-autotest fails (migration only, install is ok, both Linux and Win guests). Not sure why, perhaps the unconditional KVM_SET_GUEST_DEBUG corrupts state somehow? Tested with io thread enabled.
Marcelo Tosatti wrote: > On Thu, Feb 04, 2010 at 01:33:50AM +0100, Jan Kiszka wrote: >> Marcelo Tosatti wrote: >>> On Wed, Feb 03, 2010 at 10:29:45PM +0100, Jan Kiszka wrote: >>>> So far we synchronized any dirty VCPU state back into the kernel before >>>> updating the guest debug state. This was a tribute to a deficit in x86 >>>> kernels before 2.6.33. But as this is an arch-dependent issue, it is >>>> better handle in the x86 part of KVM and remove the writeback point for >>>> generic code. >>> Jan, >>> >>> This patch breaks migration. >> Can you elaborate what you did? I can't reproduce, and I do not see any >> conceptual issue (given that guest debugging conflicts with migration >> anyway). > > kvm-autotest fails (migration only, install is ok, both Linux and Win > guests). Not sure why, perhaps the unconditional KVM_SET_GUEST_DEBUG > corrupts state somehow? > > Tested with io thread enabled. That's this default-off thing, so... OK, confirmed, investigating. Jan
Jan Kiszka wrote: > Marcelo Tosatti wrote: >> On Thu, Feb 04, 2010 at 01:33:50AM +0100, Jan Kiszka wrote: >>> Marcelo Tosatti wrote: >>>> On Wed, Feb 03, 2010 at 10:29:45PM +0100, Jan Kiszka wrote: >>>>> So far we synchronized any dirty VCPU state back into the kernel before >>>>> updating the guest debug state. This was a tribute to a deficit in x86 >>>>> kernels before 2.6.33. But as this is an arch-dependent issue, it is >>>>> better handle in the x86 part of KVM and remove the writeback point for >>>>> generic code. >>>> Jan, >>>> >>>> This patch breaks migration. >>> Can you elaborate what you did? I can't reproduce, and I do not see any >>> conceptual issue (given that guest debugging conflicts with migration >>> anyway). >> kvm-autotest fails (migration only, install is ok, both Linux and Win >> guests). Not sure why, perhaps the unconditional KVM_SET_GUEST_DEBUG >> corrupts state somehow? >> >> Tested with io thread enabled. > > That's this default-off thing, so... OK, confirmed, investigating. > Heisenbug: It first also popped up (in form of a frozen migration target) after removing this patch, but now it's totally unreproducible, whatever patch I apply or revert from my series. Base is current master. I tend to think there is a hidden issue of iothread vs. migration, unrelated to this patch. Jan
On Thu, Feb 04, 2010 at 04:41:44PM +0100, Jan Kiszka wrote: > Jan Kiszka wrote: > > Marcelo Tosatti wrote: > >> On Thu, Feb 04, 2010 at 01:33:50AM +0100, Jan Kiszka wrote: > >>> Marcelo Tosatti wrote: > >>>> On Wed, Feb 03, 2010 at 10:29:45PM +0100, Jan Kiszka wrote: > >>>>> So far we synchronized any dirty VCPU state back into the kernel before > >>>>> updating the guest debug state. This was a tribute to a deficit in x86 > >>>>> kernels before 2.6.33. But as this is an arch-dependent issue, it is > >>>>> better handle in the x86 part of KVM and remove the writeback point for > >>>>> generic code. > >>>> Jan, > >>>> > >>>> This patch breaks migration. > >>> Can you elaborate what you did? I can't reproduce, and I do not see any > >>> conceptual issue (given that guest debugging conflicts with migration > >>> anyway). > >> kvm-autotest fails (migration only, install is ok, both Linux and Win > >> guests). Not sure why, perhaps the unconditional KVM_SET_GUEST_DEBUG > >> corrupts state somehow? > >> > >> Tested with io thread enabled. > > > > That's this default-off thing, so... OK, confirmed, investigating. > > > > Heisenbug: It first also popped up (in form of a frozen migration > target) after removing this patch, but now it's totally unreproducible, > whatever patch I apply or revert from my series. Base is current master. > > I tend to think there is a hidden issue of iothread vs. migration, > unrelated to this patch. Probably many :) Do you have c5f32c99c6855d466737daf1cd262e7e92062f87 (from qemu-kvm.git uq/master) in? With kvm-autotest the failure is not sporadic (and the above commit applied): with KVM_SET_GUEST_DEBUG in arch_put_regs all migration tests fail, without, all of them succeed. So env->kvm_guest_debug has been zeroed by cpu_x86_init, which means the writeback via KVM_SET_GUEST_DEBUG does almost nothing. It does get_rflags and set_rflags in the kernel. Test box is off, but the synchronous writeback via qemu_system_reset in main, after machine and vcpu thread initialization, might be problematic. But it would be nice to understand this. Unrelated to this problem, won't put_vcpu_events, which is executed after KVM_SET_GUEST_DEBUG, overwrite any queued debug exceptions?
Marcelo Tosatti wrote: > On Thu, Feb 04, 2010 at 04:41:44PM +0100, Jan Kiszka wrote: >> Jan Kiszka wrote: >>> Marcelo Tosatti wrote: >>>> On Thu, Feb 04, 2010 at 01:33:50AM +0100, Jan Kiszka wrote: >>>>> Marcelo Tosatti wrote: >>>>>> On Wed, Feb 03, 2010 at 10:29:45PM +0100, Jan Kiszka wrote: >>>>>>> So far we synchronized any dirty VCPU state back into the kernel before >>>>>>> updating the guest debug state. This was a tribute to a deficit in x86 >>>>>>> kernels before 2.6.33. But as this is an arch-dependent issue, it is >>>>>>> better handle in the x86 part of KVM and remove the writeback point for >>>>>>> generic code. >>>>>> Jan, >>>>>> >>>>>> This patch breaks migration. >>>>> Can you elaborate what you did? I can't reproduce, and I do not see any >>>>> conceptual issue (given that guest debugging conflicts with migration >>>>> anyway). >>>> kvm-autotest fails (migration only, install is ok, both Linux and Win >>>> guests). Not sure why, perhaps the unconditional KVM_SET_GUEST_DEBUG >>>> corrupts state somehow? >>>> >>>> Tested with io thread enabled. >>> That's this default-off thing, so... OK, confirmed, investigating. >>> >> Heisenbug: It first also popped up (in form of a frozen migration >> target) after removing this patch, but now it's totally unreproducible, >> whatever patch I apply or revert from my series. Base is current master. >> >> I tend to think there is a hidden issue of iothread vs. migration, >> unrelated to this patch. > > Probably many :) > > Do you have c5f32c99c6855d466737daf1cd262e7e92062f87 (from qemu-kvm.git > uq/master) in? Yes. And that might have been the reason why some early tests failed when it was no yet applied here. > > With kvm-autotest the failure is not sporadic (and the above commit > applied): with KVM_SET_GUEST_DEBUG in arch_put_regs all migration > tests fail, without, all of them succeed. > > So env->kvm_guest_debug has been zeroed by cpu_x86_init, which means > the writeback via KVM_SET_GUEST_DEBUG does almost nothing. It does > get_rflags and set_rflags in the kernel. Hmm, it also copies debug regs around... BTW, where do we save/restore dr0..7 between kernel and user space? But that should not be a problem, both shadow as well as effective regs should be properly initialized, specifically for a newly created VCPU. > > Test box is off, but the synchronous writeback via qemu_system_reset > in main, after machine and vcpu thread initialization, might be > problematic. But it would be nice to understand this. > > Unrelated to this problem, won't put_vcpu_events, which is executed > after KVM_SET_GUEST_DEBUG, overwrite any queued debug exceptions? Good point, SET_GUEST_DEBUG should be last in the writeback for that reason. Jan
Jan Kiszka wrote: > Marcelo Tosatti wrote: >> >> Unrelated to this problem, won't put_vcpu_events, which is executed >> after KVM_SET_GUEST_DEBUG, overwrite any queued debug exceptions? > > Good point, SET_GUEST_DEBUG should be last in the writeback for that reason. Actually, we no longer need the exception injection via SET_GUEST_DEBUG now that we have full access via vcpu_events. So this needs a cleanup, and I'm afraid quite a few cases are broken ATM with vcpu_events writeback overwriting the reinjected exceptions. Jan
Jan Kiszka wrote: > Marcelo Tosatti wrote: >> With kvm-autotest the failure is not sporadic (and the above commit >> applied): with KVM_SET_GUEST_DEBUG in arch_put_regs all migration >> tests fail, without, all of them succeed. >> >> So env->kvm_guest_debug has been zeroed by cpu_x86_init, which means >> the writeback via KVM_SET_GUEST_DEBUG does almost nothing. It does >> get_rflags and set_rflags in the kernel. > > Hmm, it also copies debug regs around... BTW, where do we save/restore > dr0..7 between kernel and user space? > > But that should not be a problem, both shadow as well as effective regs > should be properly initialized, specifically for a newly created VCPU. Could you retry after pushing SET_GUEST_DEBUG at the end of kvm_arch_put_registers? Maybe it is no good idea to run get/set_rflags without having the sregs properly initialized. Jan
On Thu, Feb 04, 2010 at 08:21:08PM +0100, Jan Kiszka wrote: > Jan Kiszka wrote: > > Marcelo Tosatti wrote: > >> With kvm-autotest the failure is not sporadic (and the above commit > >> applied): with KVM_SET_GUEST_DEBUG in arch_put_regs all migration > >> tests fail, without, all of them succeed. > >> > >> So env->kvm_guest_debug has been zeroed by cpu_x86_init, which means > >> the writeback via KVM_SET_GUEST_DEBUG does almost nothing. It does > >> get_rflags and set_rflags in the kernel. > > > > Hmm, it also copies debug regs around... BTW, where do we save/restore > > dr0..7 between kernel and user space? They're not. > > But that should not be a problem, both shadow as well as effective regs > > should be properly initialized, specifically for a newly created VCPU. Yep. > Could you retry after pushing SET_GUEST_DEBUG at the end of > kvm_arch_put_registers? Maybe it is no good idea to run get/set_rflags > without having the sregs properly initialized. Will do next week. Another tricky thing with this is that the definition of whats the kernel job and whats userspace job is somewhat blurry in points. For example set_regs clears pending exceptions, which made sense in the past, but breaks now if userspace does put_vcpu_events before set_regs (which is not the case with current userspace but just an example). Makes sense to heavily document things as suggested.
On Thu, Feb 04, 2010 at 08:21:08PM +0100, Jan Kiszka wrote: > Jan Kiszka wrote: > > Marcelo Tosatti wrote: > >> With kvm-autotest the failure is not sporadic (and the above commit > >> applied): with KVM_SET_GUEST_DEBUG in arch_put_regs all migration > >> tests fail, without, all of them succeed. > >> > >> So env->kvm_guest_debug has been zeroed by cpu_x86_init, which means > >> the writeback via KVM_SET_GUEST_DEBUG does almost nothing. It does > >> get_rflags and set_rflags in the kernel. > > > > Hmm, it also copies debug regs around... BTW, where do we save/restore > > dr0..7 between kernel and user space? > > > > But that should not be a problem, both shadow as well as effective regs > > should be properly initialized, specifically for a newly created VCPU. > > Could you retry after pushing SET_GUEST_DEBUG at the end of > kvm_arch_put_registers? Maybe it is no good idea to run get/set_rflags > without having the sregs properly initialized. > > Jan Can't reproduce (with the original patch, KVM_SET_GUEST_DEBUG at beginning of arch_put_regs). Different test box though. Go figure. Anyway, as you mentioned, the workaround can be made cleaner through set_vcpu_events.
On Thu, Feb 04, 2010 at 08:00:26PM +0100, Jan Kiszka wrote: > Jan Kiszka wrote: > > Marcelo Tosatti wrote: > >> > >> Unrelated to this problem, won't put_vcpu_events, which is executed > >> after KVM_SET_GUEST_DEBUG, overwrite any queued debug exceptions? > > > > Good point, SET_GUEST_DEBUG should be last in the writeback for that reason. > > Actually, we no longer need the exception injection via SET_GUEST_DEBUG > now that we have full access via vcpu_events. > So this needs a cleanup, and I'm afraid quite a few cases are broken > ATM with vcpu_events writeback overwriting the reinjected exceptions. Don't see what you mean here. Can you be more explicit? What breakage?
Marcelo Tosatti wrote: > On Thu, Feb 04, 2010 at 08:00:26PM +0100, Jan Kiszka wrote: >> Jan Kiszka wrote: >>> Marcelo Tosatti wrote: >>>> Unrelated to this problem, won't put_vcpu_events, which is executed >>>> after KVM_SET_GUEST_DEBUG, overwrite any queued debug exceptions? >>> Good point, SET_GUEST_DEBUG should be last in the writeback for that reason. >> Actually, we no longer need the exception injection via SET_GUEST_DEBUG >> now that we have full access via vcpu_events. > >> So this needs a cleanup, and I'm afraid quite a few cases are broken >> ATM with vcpu_events writeback overwriting the reinjected exceptions. > > Don't see what you mean here. Can you be more explicit? What breakage? SET_GUEST_DEBUG and SET_VCPU_EVENTS are mutually exclusive. If we have the latter, don't use the former for exception injection anymore. And this is broken already without any of my patches applied, that was my point. Will work on this soon. Jan
diff --git a/kvm-all.c b/kvm-all.c index 67b44b5..f3e09c8 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -938,10 +938,6 @@ static void kvm_invoke_set_guest_debug(void *data) struct kvm_set_guest_debug_data *dbg_data = data; CPUState *env = dbg_data->env; - if (env->kvm_vcpu_dirty) { - kvm_arch_put_registers(env); - env->kvm_vcpu_dirty = 0; - } dbg_data->err = kvm_vcpu_ioctl(env, KVM_SET_GUEST_DEBUG, &dbg_data->dbg); } @@ -949,12 +945,12 @@ int kvm_update_guest_debug(CPUState *env, unsigned long reinject_trap) { struct kvm_set_guest_debug_data data; - data.dbg.control = 0; - if (env->singlestep_enabled) - data.dbg.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP; + data.dbg.control = reinject_trap; + if (env->singlestep_enabled) { + data.dbg.control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP; + } kvm_arch_update_guest_debug(env, &data.dbg); - data.dbg.control |= reinject_trap; data.env = env; on_vcpu(env, kvm_invoke_set_guest_debug, &data); diff --git a/target-i386/cpu.h b/target-i386/cpu.h index 216b00e..dfc4a26 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -21,6 +21,10 @@ #include "config.h" +#ifdef CONFIG_KVM +#include <linux/kvm.h> /* for kvm_guest_debug */ +#endif + #ifdef TARGET_X86_64 #define TARGET_LONG_BITS 64 #else @@ -702,7 +706,10 @@ typedef struct CPUX86State { uint8_t has_error_code; uint32_t sipi_vector; uint32_t cpuid_kvm_features; - +#if defined(CONFIG_KVM) && defined(KVM_CAP_SET_GUEST_DEBUG) + struct kvm_guest_debug kvm_guest_debug; +#endif + /* in order to simplify APIC support, we leave this pointer to the user */ struct APICState *apic_state; diff --git a/target-i386/kvm.c b/target-i386/kvm.c index f690b2e..2b3d8f6 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -851,6 +851,15 @@ int kvm_arch_put_registers(CPUState *env) if (ret < 0) return ret; + /* + * Kernels before 2.6.33 overwrote flags.TF injected via SET_GUEST_DEBUG + * while updating GP regs. Work around this by updating the debug state + * once again. + */ + ret = kvm_vcpu_ioctl(env, KVM_SET_GUEST_DEBUG, &env->kvm_guest_debug); + if (ret < 0) + return ret; + ret = kvm_put_fpu(env); if (ret < 0) return ret; @@ -1147,5 +1156,7 @@ void kvm_arch_update_guest_debug(CPUState *env, struct kvm_guest_debug *dbg) (len_code[hw_breakpoint[n].len] << (18 + n*4)); } } + /* Keep a copy for the writeback workaround in kvm_arch_put_registers */ + memcpy(&env->kvm_guest_debug, dbg, sizeof(env->kvm_guest_debug)); } #endif /* KVM_CAP_SET_GUEST_DEBUG */
So far we synchronized any dirty VCPU state back into the kernel before updating the guest debug state. This was a tribute to a deficit in x86 kernels before 2.6.33. But as this is an arch-dependent issue, it is better handle in the x86 part of KVM and remove the writeback point for generic code. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> --- kvm-all.c | 12 ++++-------- target-i386/cpu.h | 9 ++++++++- target-i386/kvm.c | 11 +++++++++++ 3 files changed, 23 insertions(+), 9 deletions(-)