latest rc: virtio-blk hangs forever after migration

Message ID 20140716115229.GA7741@amt.cnet
State New

Commit Message

Marcelo Tosatti July 16, 2014, 11:52 a.m. UTC
On Wed, Jul 16, 2014 at 12:38:51PM +0400, Andrey Korolyov wrote:
> On Wed, Jul 16, 2014 at 5:16 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Wed, Jul 16, 2014 at 03:40:47AM +0400, Andrey Korolyov wrote:
> >> On Wed, Jul 16, 2014 at 2:01 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >> > Il 15/07/2014 23:25, Andrey Korolyov ha scritto:
> >> >
> >> >> On Wed, Jul 16, 2014 at 1:09 AM, Marcelo Tosatti <mtosatti@redhat.com>
> >> >> wrote:
> >> >>>
> >> >>> On Tue, Jul 15, 2014 at 06:01:08PM +0400, Andrey Korolyov wrote:
> >> >>>>
> >> >>>> On Tue, Jul 15, 2014 at 10:52 AM, Andrey Korolyov <andrey@xdel.ru>
> >> >>>> wrote:
> >> >>>>>
> >> >>>>> On Tue, Jul 15, 2014 at 9:03 AM, Amit Shah <amit.shah@redhat.com>
> >> >>>>> wrote:
> >> >>>>>>
> >> >>>>>> On (Sun) 13 Jul 2014 [16:28:56], Andrey Korolyov wrote:
> >> >>>>>>>
> >> >>>>>>> Hello,
> >> >>>>>>>
> >> >>>>>>> the issue is not specific to the iothread code because generic
> >> >>>>>>> virtio-blk also hangs up:
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> Do you know which version works well?  If you could bisect, that'll
> >> >>>>>> help a lot.
> >> >>>>>>
> >> >>>>>> Thanks,
> >> >>>>>>                 Amit
> >> >>>>>
> >> >>>>>
> >> >>>>> Hi,
> >> >>>>>
> >> >>>>> 2.0 definitely works well. I'll try to finish the bisection today, though
> >> >>>>> every step takes about 10 minutes to complete.
> >> >>>>
> >> >>>>
> >> >>>> Yay! It is even outside of virtio-blk.
> >> >>>>
> >> >>>> commit 9b1786829aefb83f37a8f3135e3ea91c56001b56
> >> >>>> Author: Marcelo Tosatti <mtosatti@redhat.com>
> >> >>>> Date:   Tue Jun 3 13:34:48 2014 -0300
> >> >>>>
> >> >>>>     kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec
> >> >>>> calculation
> >> >>>>
> >> >>>>     Ensure proper env->tsc value for kvmclock_current_nsec calculation.
> >> >>>>
> >> >>>>     Reported-by: Marcin Gibuła <m.gibula@beyond.pl>
> >> >>>>     Cc: qemu-stable@nongnu.org
> >> >>>>     Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> >> >>>>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >> >>>
> >> >>>
> >> >>> Andrey,
> >> >>>
> >> >>> Can you please provide instructions on how to create reproducible
> >> >>> environment?
> >> >>>
> >> >>> The following patch is equivalent to the original patch, for
> >> >>> the purposes of fixing the kvmclock problem.
> >> >>>
> >> >>> Perhaps it becomes easier to spot the reason for the hang you are
> >> >>> experiencing.
> >> >>>
> >> >>>
> >> >>> diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c
> >> >>> index 272a88a..feb5fc5 100644
> >> >>> --- a/hw/i386/kvm/clock.c
> >> >>> +++ b/hw/i386/kvm/clock.c
> >> >>> @@ -17,7 +17,6 @@
> >> >>>  #include "qemu/host-utils.h"
> >> >>>  #include "sysemu/sysemu.h"
> >> >>>  #include "sysemu/kvm.h"
> >> >>> -#include "sysemu/cpus.h"
> >> >>>  #include "hw/sysbus.h"
> >> >>>  #include "hw/kvm/clock.h"
> >> >>>
> >> >>> @@ -66,7 +65,6 @@ static uint64_t kvmclock_current_nsec(KVMClockState *s)
> >> >>>
> >> >>>      cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time));
> >> >>>
> >> >>> -    assert(time.tsc_timestamp <= migration_tsc);
> >> >>>      delta = migration_tsc - time.tsc_timestamp;
> >> >>>      if (time.tsc_shift < 0) {
> >> >>>          delta >>= -time.tsc_shift;
> >> >>> @@ -125,8 +123,6 @@ static void kvmclock_vm_state_change(void *opaque,
> >> >>> int running,
> >> >>>          if (s->clock_valid) {
> >> >>>              return;
> >> >>>          }
> >> >>> -
> >> >>> -        cpu_synchronize_all_states();
> >> >>>          ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data);
> >> >>>          if (ret < 0) {
> >> >>>              fprintf(stderr, "KVM_GET_CLOCK failed: %s\n",
> >> >>> strerror(ret));
> >> >>> diff --git a/migration.c b/migration.c
> >> >>> index 8d675b3..34f2325 100644
> >> >>> --- a/migration.c
> >> >>> +++ b/migration.c
> >> >>> @@ -608,6 +608,7 @@ static void *migration_thread(void *opaque)
> >> >>>                  qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
> >> >>>                  old_vm_running = runstate_is_running();
> >> >>>
> >> >>> +                cpu_synchronize_all_states();
> >> >>>                  ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
> >> >>>                  if (ret >= 0) {
> >> >>>                      qemu_file_set_rate_limit(s->file, INT64_MAX);
> >> >
> >> >
> >> > It could also be useful to apply the above patch _and_ revert
> >> > a096b3a6732f846ec57dc28b47ee9435aa0609bf, then try to reproduce.
> >> >
> >> > Paolo
> >>
> >> Yes, it solved the issue for me! (It took a long time to check because
> >> most of the country's Debian mirrors went inconsistent for some reason.)
> >>
> >> Also trivial addition:
> >>
> >> diff --git a/migration.c b/migration.c
> >> index 34f2325..65d1c88 100644
> >> --- a/migration.c
> >> +++ b/migration.c
> >> @@ -25,6 +25,7 @@
> >>  #include "qemu/thread.h"
> >>  #include "qmp-commands.h"
> >>  #include "trace.h"
> >> +#include "sysemu/cpus.h"
> >
> > And what about not reverting a096b3a6732f846ec57dc28b47ee9435aa0609bf ?
> >
> > That is, test with a stock qemu.git tree and the patch sent today,
> > on this thread, to move cpu_synchronize_all_states ?
> >
> >
> 
> The main reason things work for me is a revert of
> 9b1786829aefb83f37a8f3135e3ea91c56001b56 on top, without adding any other
> patches. I tested two cases: Alexander's patch completely reverted plus
> the suggestion from Marcelo, and only the revert of 9b178682 plus the
> same suggestion. The difference is that until Alexander's patch is
> reverted, live migration always fails at the timeout value, and when it
> is reverted, migration always succeeds in 8-10 seconds. The appropriate
> diffs are attached for reference.

Andrey,

Can you please apply only the following attached patch to an upstream
QEMU git tree (move_synchronize_all_states.patch), plus the necessary
header file corrections, and attempt to reproduce?

When you reproduce, please provide a backtrace and version of the QEMU
git tree, and instructions on how to reproduce:

1) full QEMU command line
2) steps to reproduce
Move cpu_synchronize_all_states to migration.c.
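
For context, the calculation that needs an up-to-date env->tsc lives in
kvmclock_current_nsec(). Roughly, paraphrased and simplified from the
upstream code of this period (treat the details as approximate), it does the
following; qemu/host-utils.h provides mulu64(), as already included by
hw/i386/kvm/clock.c:

/* Reconstruct the guest clock at stop time from the pvclock page the
 * guest last saw, plus the TSC delta since that snapshot.  If env->tsc
 * is stale because the CPU state was not synchronized from KVM first,
 * migration_tsc can be older than time.tsc_timestamp and the delta
 * underflows -- exactly what the assert() guards against (the debugging
 * patch quoted above removes that assert). */
static uint64_t kvmclock_current_nsec(KVMClockState *s)
{
    CPUX86State *env = first_cpu->env_ptr;
    hwaddr kvmclock_struct_pa = env->system_time_msr & ~1ULL;
    uint64_t migration_tsc = env->tsc;   /* only fresh after a state sync */
    struct pvclock_vcpu_time_info time;
    uint64_t delta, nsec_lo, nsec_hi;

    cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time));

    assert(time.tsc_timestamp <= migration_tsc);
    delta = migration_tsc - time.tsc_timestamp;
    if (time.tsc_shift < 0) {
        delta >>= -time.tsc_shift;
    } else {
        delta <<= time.tsc_shift;
    }
    /* nanoseconds = (delta * tsc_to_system_mul) >> 32, done in 128 bits */
    mulu64(&nsec_lo, &nsec_hi, delta, time.tsc_to_system_mul);
    return ((nsec_lo >> 32) | (nsec_hi << 32)) + time.system_time;
}

Moving cpu_synchronize_all_states() into migration_thread(), before
vm_stop_force_state(), is meant to guarantee that env->tsc has been
refreshed from KVM by the time kvmclock_vm_state_change() runs this code.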

Comments

Andrey Korolyov July 16, 2014, 1:24 p.m. UTC | #1
On Wed, Jul 16, 2014 at 3:52 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> Andrey,
>
> Can you please apply only the following attached patch to an upstream
> QEMU git tree (move_synchronize_all_states.patch), plus the necessary
> header file corrections, and attempt to reproduce?
>
> When you reproduce, please provide a backtrace and version of the QEMU
> git tree, and instructions on how to reproduce:
>
> 1) full QEMU command line
> 2) steps to reproduce
>
>

Marcelo, as far as I can see, this patch matches the second case from my
previous message exactly (the diffs against the generic rc are attached
there). I/O does not lock up there, but live migration fails and libvirt
moves over a frozen state. I can try to run the same on top of rc2, but it
will probably be the same.
Andrey Korolyov July 16, 2014, 6:25 p.m. UTC | #2
On Wed, Jul 16, 2014 at 5:24 PM, Andrey Korolyov <andrey@xdel.ru> wrote:
> On Wed, Jul 16, 2014 at 3:52 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> Andrey,
>>
>> Can you please apply only the following attached patch to an upstream
>> QEMU git tree (move_synchronize_all_states.patch), plus the necessary
>> header file corrections, and attempt to reproduce?
>>
>> When you reproduce, please provide a backtrace and version of the QEMU
>> git tree, and instructions on how to reproduce:
>>
>> 1) full QEMU command line
>> 2) steps to reproduce
>>
>>
>
> Marcelo, as far as I can see, this patch matches the second case from my
> previous message exactly (the diffs against the generic rc are attached
> there). I/O does not lock up there, but live migration fails and libvirt
> moves over a frozen state. I can try to run the same on top of rc2, but it
> will probably be the same.


Tested on an iscsi pool, though there is a no-cache requirement: rbd
with the cache disabled may survive one migration, but the iscsi backend
always hangs. As before, just rolling back the problematic commit fixes
the problem, and adding cpu_synchronize_all_states to migration.c makes
no visible difference in the VM's behavior. The problem consists of at
least two separate ones: the current hang, and the behavior with the
unreverted patch from agraf - the latter causes live migration with
writeback cache to fail, while cache=none works well in any variant that
survives the first condition. Marcin, would you mind checking the current
state of the problem on your environments in your spare time? It is
probably easier to reproduce on iscsi because it takes far less time to
set up; the command line and libvirt config are attached
(v2.1.0-rc2 plus iscsi-1.11.0).
qemu-system-x86_64 -enable-kvm -name vm27842 -S \
    -machine pc-i440fx-2.1,accel=kvm,usb=off -m 256 -realtime mlock=off \
    -smp 12,sockets=1,cores=12,threads=12 -numa node,nodeid=0,cpus=0,mem=256 \
    -uuid 9a44bdae-5702-463b-aa1e-d8d85055f6af -nographic -no-user-config -nodefaults \
    -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm27842.monitor,server,nowait \
    -mon chardev=charmonitor,id=monitor,mode=control \
    -rtc base=utc,clock=vm,driftfix=slew -global kvm-pit.lost_tick_policy=discard \
    -no-shutdown -boot strict=on \
    -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 \
    -drive file=/dev/disk/by-path/ip-10.6.0.1:3260-iscsi-iqn.2014-05.ru.flops:kvmguest-lun-1,if=none,id=drive-virtio-disk0,format=raw,cache=none \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:10:03:40,bus=pci.0,addr=0x3 \
    -netdev tap,fd=25,id=hostnet1,vhost=on,vhostfd=26 \
    -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:10:03:41,bus=pci.0,addr=0x4 \
    -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 \
    -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/vm27842.sock,server,nowait \
    -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1
[random]
rw=randrw
size=768M
directory=/data
iodepth=32
fsync=1
direct=1
blocksize=2M
numjobs=1
nrfiles=4
group_reporting
ioengine=libaio
loops=512
Marcin Gibuła July 16, 2014, 9:28 p.m. UTC | #3
> Tested on an iscsi pool, though there is a no-cache requirement: rbd
> with the cache disabled may survive one migration, but the iscsi backend
> always hangs. As before, just rolling back the problematic commit fixes
> the problem, and adding cpu_synchronize_all_states to migration.c makes
> no visible difference in the VM's behavior. The problem consists of at
> least two separate ones: the current hang, and the behavior with the
> unreverted patch from agraf - the latter causes live migration with
> writeback cache to fail, while cache=none works well in any variant that
> survives the first condition. Marcin, would you mind checking the current
> state of the problem on your environments in your spare time? It is
> probably easier to reproduce on iscsi because it takes far less time to
> set up; the command line and libvirt config are attached
> (v2.1.0-rc2 plus iscsi-1.11.0).

Ok, but what exactly do you want me to test?

Just to avoid any confusion, originally there were two problems with 
kvmclock:

1. Commit a096b3a6732f846ec57dc28b47ee9435aa0609bf fixes a problem where
clock drift (?) caused kvmclock in the guest to report a time in the past,
which caused the guest kernel to hang. This is hard to reproduce reliably
(probably because it requires a long time for the drift to accumulate).

2. Commit 9b1786829aefb83f37a8f3135e3ea91c56001b56 fixes a regression
caused by a096b3a6732f846ec57dc28b47ee9435aa0609bf that occurred during
non-migration operations (drive-mirror + pivot), which also caused the
guest kernel to hang. This is trivial to reproduce.

I'm using both of them applied on top of 2.0 in production and have no 
problems with them. I'm using NFS exclusively with cache=none.

So, I shall test vm-migration and drive-migration with 2.1.0-rc2 with no 
extra patches applied or reverted, on VM that is running fio, am I correct?
Andrey Korolyov July 16, 2014, 9:36 p.m. UTC | #4
On Thu, Jul 17, 2014 at 1:28 AM, Marcin Gibuła <m.gibula@beyond.pl> wrote:
> Ok, but what exactly do you want me to test?
>
> Just to avoid any confusion, originally there were two problems with
> kvmclock:
>
> 1. Commit a096b3a6732f846ec57dc28b47ee9435aa0609bf fixes a problem where clock
> drift (?) caused kvmclock in the guest to report a time in the past, which
> caused the guest kernel to hang. This is hard to reproduce reliably (probably
> because it requires a long time for the drift to accumulate).
>
> 2. Commit 9b1786829aefb83f37a8f3135e3ea91c56001b56 fixes a regression caused
> by a096b3a6732f846ec57dc28b47ee9435aa0609bf that occurred during
> non-migration operations (drive-mirror + pivot), which also caused the guest
> kernel to hang. This is trivial to reproduce.
>
> I'm using both of them applied on top of 2.0 in production and have no
> problems with them. I'm using NFS exclusively with cache=none.
>
> So, I shall test vm-migration and drive-migration with 2.1.0-rc2 with no
> extra patches applied or reverted, on VM that is running fio, am I correct?
>

Yes, exactly. ISCSI-based setup can take some minutes to deploy, given
prepared image, and I have one hundred percent hit rate for the
original issue with it.

Marcin Gibuła July 17, 2014, 9:49 a.m. UTC | #5
>> I'm using both of them applied on top of 2.0 in production and have no
>> problems with them. I'm using NFS exclusively with cache=none.
>>
>> So, I shall test vm-migration and drive-migration with 2.1.0-rc2 with no
>> extra patches applied or reverted, on VM that is running fio, am I correct?
>>
>
> Yes, exactly. ISCSI-based setup can take some minutes to deploy, given
> prepared image, and I have one hundred percent hit rate for the
> original issue with it.

I've reproduced your IO hang with 2.0 and both 
9b1786829aefb83f37a8f3135e3ea91c56001b56 and 
a096b3a6732f846ec57dc28b47ee9435aa0609bf applied.

Reverting 9b1786829aefb83f37a8f3135e3ea91c56001b56 indeed fixes the
problem (but reintroduces the block-migration hang). It seems like a qemu
bug rather than a guest problem, as the no-kvmclock guest parameter makes
no difference. IO just stops and all qemu IO threads die off. Almost like
it forgets to migrate them :-)

I'm attaching backtraces from the guest kernel and qemu, plus the qemu command line.

Going to compile 2.1-rc.
Marcin Gibuła July 17, 2014, 11:20 a.m. UTC | #6
> I've reproduced your IO hang with 2.0 and both
> 9b1786829aefb83f37a8f3135e3ea91c56001b56 and
> a096b3a6732f846ec57dc28b47ee9435aa0609bf applied.
>
> Reverting 9b1786829aefb83f37a8f3135e3ea91c56001b56 indeed fixes the
> problem (but reintroduces the block-migration hang). It seems like a qemu
> bug rather than a guest problem, as the no-kvmclock guest parameter makes
> no difference. IO just stops and all qemu IO threads die off. Almost like
> it forgets to migrate them :-)

Some more info:

a) 2.0 + 9b1786829aefb83f37a8f3135e3ea91c56001b56 + 
a096b3a6732f846ec57dc28b47ee9435aa0609bf = hangs

b) 2.0 + 9b1786829aefb83f37a8f3135e3ea91c56001b56 = works

c) 2.0 + 9b1786829aefb83f37a8f3135e3ea91c56001b56 + move
cpu_synchronize_all_states to migration.c = works

Tested with NFS (qcow2) + cache=none.

IO is dead only for the disk that was being written to during migration.
I.e. if my test VM has two disks, vda and vdb, and I'm running fio on
vdb and it hangs after migration, I can still issue writes to vda.

Recreation steps:
1. Create a VM.
2. Run fio (Andrey's config).
3. Live migrate the VM a couple of times.
Marcin Gibuła July 17, 2014, 11:54 a.m. UTC | #7
>> Yes, exactly. ISCSI-based setup can take some minutes to deploy, given
>> prepared image, and I have one hundred percent hit rate for the
>> original issue with it.
>
> I've reproduced your IO hang with 2.0 and both
> 9b1786829aefb83f37a8f3135e3ea91c56001b56 and
> a096b3a6732f846ec57dc28b47ee9435aa0609bf applied.
>
> Reverting 9b1786829aefb83f37a8f3135e3ea91c56001b56 indeed fixes the
> problem (but reintroduces the block-migration hang). It seems like a qemu
> bug rather than a guest problem, as the no-kvmclock guest parameter makes
> no difference. IO just stops and all qemu IO threads die off. Almost like
> it forgets to migrate them :-)
>
> I'm attaching backtraces from the guest kernel and qemu, plus the qemu command line.
>
> Going to compile 2.1-rc.

2.1-rc2 behaves exactly the same.

Interestingly enough, resetting the guest system causes I/O to work again. So
it's not qemu that hangs on IO; rather, it fails to notify the guest about
completed operations that were issued during migration.
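
(For reference, the completion path that appears to get lost here is roughly
the following, paraphrased from the virtio-blk code of that era -- the exact
names and fields are approximate:)

static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
{
    VirtIOBlock *s = req->dev;

    /* Record the status where the guest driver will look for it. */
    stb_p(&req->in->status, status);
    /* Return the descriptor chain to the used ring... */
    virtqueue_push(s->vq, &req->elem, req->qiov.size + sizeof(*req->in));
    /* ...and notify the guest.  If this notification never reaches the
     * guest for requests completed around the migration downtime window,
     * the guest driver keeps waiting on requests it still considers in
     * flight -- until a reset rebuilds the virtqueue state. */
    virtio_notify(VIRTIO_DEVICE(s), s->vq);
}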

And it's somehow caused by calling cpu_synchronize_all_states() inside
kvmclock_vm_state_change().
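
(cpu_synchronize_all_states() itself is small; simplified from the code of
this period, and with names treated as approximate, it is roughly:)

void cpu_synchronize_all_states(void)
{
    CPUState *cpu;

    /* Pull the current register state (including the TSC) out of KVM for
     * every vcpu and mark it dirty, so QEMU's copy stays authoritative
     * until the next KVM_RUN. */
    CPU_FOREACH(cpu) {
        cpu_synchronize_state(cpu);  /* ends up in kvm_cpu_synchronize_state() */
    }
}

void kvm_cpu_synchronize_state(CPUState *cpu)
{
    if (!cpu->kvm_vcpu_dirty) {
        /* Runs kvm_arch_get_registers(cpu) on the vcpu thread and then
         * sets cpu->kvm_vcpu_dirty = true. */
        run_on_cpu(cpu, do_kvm_cpu_synchronize_state, cpu);
    }
}

The patch in this thread only moves that call: it now runs from
migration_thread() right before vm_stop_force_state(), instead of from
inside the kvmclock vm_state_change handler.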



As for testing with cache=writeback, I'll try to set up some iscsi to
test it.
Andrey Korolyov July 17, 2014, 12:06 p.m. UTC | #8
On Thu, Jul 17, 2014 at 3:54 PM, Marcin Gibuła <m.gibula@beyond.pl> wrote:
> 2.1-rc2 behaves exactly the same.
>
> Interestingly enough, resetting the guest system causes I/O to work again. So
> it's not qemu that hangs on IO; rather, it fails to notify the guest about
> completed operations that were issued during migration.
>
> And it's somehow caused by calling cpu_synchronize_all_states() inside
> kvmclock_vm_state_change().
>
>
>
> As for testing with cache=writeback, I'll try to set up some iscsi to test
> it.

Awesome, thanks! AFAIK you won't be able to use the write cache with
iscsi for migration. A VM which has been reset before always hangs,
while freshly launched ones have a chance to be migrated successfully.
And yes, at a glance it does look like the lower layer is forgetting to
notify the driver about some operations.


Patch

diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c
index 272a88a..feb5fc5 100644
--- a/hw/i386/kvm/clock.c
+++ b/hw/i386/kvm/clock.c
@@ -17,7 +17,6 @@ 
 #include "qemu/host-utils.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/kvm.h"
-#include "sysemu/cpus.h"
 #include "hw/sysbus.h"
 #include "hw/kvm/clock.h"
 
@@ -66,7 +65,6 @@  static uint64_t kvmclock_current_nsec(KVMClockState *s)
 
     cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time));
 
-    assert(time.tsc_timestamp <= migration_tsc);
     delta = migration_tsc - time.tsc_timestamp;
     if (time.tsc_shift < 0) {
         delta >>= -time.tsc_shift;
@@ -125,8 +123,6 @@  static void kvmclock_vm_state_change(void *opaque, int running,
         if (s->clock_valid) {
             return;
         }
-
-        cpu_synchronize_all_states();
         ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data);
         if (ret < 0) {
             fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret));
diff --git a/migration.c b/migration.c
index 8d675b3..34f2325 100644
--- a/migration.c
+++ b/migration.c
@@ -608,6 +608,7 @@  static void *migration_thread(void *opaque)
                 qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
                 old_vm_running = runstate_is_running();
 
+                cpu_synchronize_all_states();
                 ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
                 if (ret >= 0) {
                     qemu_file_set_rate_limit(s->file, INT64_MAX);