Patchwork Weird thing happens when the VM is stopped! (0.12.3)

Submitter Jun Koi
Date April 9, 2010, 4:12 p.m.
Message ID <r2sfdaac4d51004090912r53449ccj569223bdbee22556@mail.gmail.com>
Permalink /patch/49853/
State New

Comments

Jun Koi - April 9, 2010, 4:12 p.m.
On Fri, Apr 9, 2010 at 10:20 PM, Luiz Capitulino <lcapitulino@redhat.com> wrote:
> On Fri, 9 Apr 2010 18:32:21 +0900
> Jun Koi <junkoi2004@gmail.com> wrote:
>
>> Hi,
>>
>> I found something very strange happening with 0.12.3: it seems the VM is
>> still running even after I have stopped it.
>>
>> Here is how I verified that: boot any OS (I checked with Windows XP
>> and Ubuntu) with 0.12.3, and stop it any time after it has booted up,
>> using the "stop" command in the monitor interface.
>>
>> Now the VM stops. Then, in the same monitor interface, run "info
>> registers" again and again. You can see that the values of EIP and
>> EFLAGS still change once in a while. This should not happen, because
>> the VM is already stopped.
>>
>> I checked, and I don't see this problem with 0.11.1. It doesn't
>> happen with the latest code in the git tree, either.
>>
>> Any idea why this happens?
>
>  Can you try commit 55274a305? If it fixes the problem, we need it
> in stable; if it doesn't, you can try to find the fix by using git bisect.
>

This hint makes sense, but the point is that I tried some commits
before 55274a305 and didn't see the problem. Still, I am not sure
whether the problem was already fixed before 55274a305, or whether I
was just unlucky and failed to hit the problem when testing.

After bisecting, I can say that the culprit is the patch below, from Marcelo.

Now I am wondering whether commit 55274a305 from Paolo Bonzini
fixed the bug, or some other commit before it. We should find the
correct fix and port it to 0.12.4.

Thanks,
J



commit 535d2eb34a0f1908dc694c51ce8d4ec6dccc7807
Author: Marcelo Tosatti <mtosatti@redhat.com>
Date:   Tue Feb 9 12:49:04 2010 -0200

    iothread: fix vcpu stop with smp tcg

    Round robin vcpus in tcg_cpu_next even if the vm stopped. This
    allows all cpus to enter stopped state.

    Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
    (cherry picked from commit c37cc7b072fa4ca8d8d21ac31d26baff5f47f9f9)

Marcelo Tosatti - April 9, 2010, 6:09 p.m.
On Sat, Apr 10, 2010 at 01:12:27AM +0900, Jun Koi wrote:
> Now I am wondering whether commit 55274a305 from Paolo Bonzini
> fixed the bug, or some other commit before it. We should find the
> correct fix and port it to 0.12.4.

I guess it's c5f32c99. Can you confirm, please?

Jun Koi - April 12, 2010, 6:22 a.m.
On Sat, Apr 10, 2010 at 3:09 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> I guess it's c5f32c99. Can you confirm, please?

I back-ported this patch to 0.12.3, and it seems the problem is gone.

Thanks,
J

Patch

diff --git a/vl.c b/vl.c
index 007709a..3b5a8e0 100644
--- a/vl.c
+++ b/vl.c
@@ -4042,14 +4042,15 @@  static void tcg_cpu_exec(void)
     for (; next_cpu != NULL; next_cpu = next_cpu->next_cpu) {
         CPUState *env = cur_cpu = next_cpu;

-        if (!vm_running)
-            break;
         if (timer_alarm_pending) {
             timer_alarm_pending = 0;
             break;
         }
         if (cpu_can_run(env))
             ret = qemu_cpu_exec(env);
+        else if (env->stop)
+            break;
+
         if (ret == EXCP_DEBUG) {
             gdb_set_stop_cpu(env);
             debug_requested = 1;
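
One possible reading of the hunk above, sketched as a toy model rather than QEMU code: before the patch the loop left as soon as vm_running was false; after the patch only per-CPU state decides whether a CPU executes. The names ToyCPU, toy_cpu_can_run, toy_exec_before and toy_exec_after are hypothetical. The sketch assumes that cpu_can_run() looks only at per-CPU stop flags and not at vm_running, which this thread does not confirm. The timer_alarm_pending and EXCP_DEBUG handling is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for CPUState: only the fields this sketch needs. */
typedef struct ToyCPU {
    bool stop;                /* per-CPU stop request */
    bool stopped;             /* per-CPU stop acknowledged */
    int executed;             /* how many times this CPU "ran" */
    struct ToyCPU *next_cpu;  /* next CPU in the round-robin list */
} ToyCPU;

/* ASSUMPTION: models a cpu_can_run() that ignores the global
 * vm_running flag and checks per-CPU state only. */
static bool toy_cpu_can_run(const ToyCPU *env)
{
    return !env->stop && !env->stopped;
}

/* Loop shape before the patch: leave as soon as the VM is not running. */
static void toy_exec_before(ToyCPU *first, bool vm_running)
{
    for (ToyCPU *env = first; env != NULL; env = env->next_cpu) {
        if (!vm_running)
            break;
        if (toy_cpu_can_run(env))
            env->executed++;
    }
}

/* Loop shape after the patch: vm_running is no longer consulted. */
static void toy_exec_after(ToyCPU *first, bool vm_running)
{
    (void)vm_running; /* kept only to mirror toy_exec_before() */
    for (ToyCPU *env = first; env != NULL; env = env->next_cpu) {
        if (toy_cpu_can_run(env))
            env->executed++;
        else if (env->stop)
            break;
    }
}
```

Under that assumption, a CPU whose stop flags are clear still executes in toy_exec_after() when vm_running is false, which would match the changing EIP and EFLAGS values reported above. If the real cpu_can_run() also checked vm_running, both loops would leave such a CPU idle.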