Patchwork Re: MIPS, io-thread, icount and wfi

login
register
mail settings
Submitter Edgar Iglesias
Date Jan. 18, 2011, 12:19 a.m.
Message ID <20110118001950.GA11802@laped.lan>
Download mbox | patch
Permalink /patch/79237/
State New
Headers show

Comments

Edgar Iglesias - Jan. 18, 2011, 12:19 a.m.
On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote:
> Hi,
> 
> I'm running an io-thread enabled qemu-system-mipsel with icount.
> When the guest (linux) goes to sleep through the wait insn (waiting
> to be woken up by future timer interrupts), the thing deadlocks.
> 
> IIUC, this is because vm timers are driven by icount, but the CPU is
> halted so icount makes no progress and time stands still.
> 
> I've locally disabled vcpu halting when icount is enabled, that
> works around my problem but of course makes qemu consume 100% host cpu.
> 
> I don't know why I only see this problem with io-thread builds?
> Could be related timing and luck.
> 
> Would be interesting to know if someone has any info on how this was
> intended to work (if it was)? And if there are ideas for better
> workarounds or fixes that don't disable vcpu halting entirely.

Hi,

I've found the problem. For some reason io-thread builds use a
static timeout for wait loops. The entire chunk of code that
makes sure qemu_icount makes forward progress when the CPU's
are idle has been ifdef'ed away...

This fixes the problem for me, hopefully without affecting
io-thread runs without icount.

commit 0f4f3a919952500b487b438c5520f07a1c6be35b
Author: Edgar E. Iglesias <edgar@axis.com>
Date:   Tue Jan 18 01:01:57 2011 +0100

    qemu-timer: Fix timeout calc for io-thread with icount
    
    Make sure we always make forward progress with qemu_icount to
    avoid deadlocks. For io-thread, use the static 1000 timeout
    only if icount is disabled.
    
    Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
Jan Kiszka - Jan. 18, 2011, 10 a.m.
On 2011-01-18 01:19, Edgar E. Iglesias wrote:
> On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote:
>> Hi,
>>
>> I'm running an io-thread enabled qemu-system-mipsel with icount.
>> When the guest (linux) goes to sleep through the wait insn (waiting
>> to be woken up by future timer interrupts), the thing deadlocks.
>>
>> IIUC, this is because vm timers are driven by icount, but the CPU is
>> halted so icount makes no progress and time stands still.
>>
>> I've locally disabled vcpu halting when icount is enabled, that
>> works around my problem but of course makes qemu consume 100% host cpu.
>>
>> I don't know why I only see this problem with io-thread builds?
>> Could be related timing and luck.
>>
>> Would be interesting to know if someone has any info on how this was
>> intended to work (if it was)? And if there are ideas for better
>> workarounds or fixes that don't disable vcpu halting entirely.
> 
> Hi,
> 
> I've found the problem. For some reason io-thread builds use a
> static timeout for wait loops. The entire chunk of code that
> makes sure qemu_icount makes forward progress when the CPU's
> are idle has been ifdef'ed away...
> 
> This fixes the problem for me, hopefully without affecting
> io-thread runs without icount.
> 
> commit 0f4f3a919952500b487b438c5520f07a1c6be35b
> Author: Edgar E. Iglesias <edgar@axis.com>
> Date:   Tue Jan 18 01:01:57 2011 +0100
> 
>     qemu-timer: Fix timeout calc for io-thread with icount
>     
>     Make sure we always make forward progress with qemu_icount to
>     avoid deadlocks. For io-thread, use the static 1000 timeout
>     only if icount is disabled.
>     
>     Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
> 
> diff --git a/qemu-timer.c b/qemu-timer.c
> index 95814af..db1ec49 100644
> --- a/qemu-timer.c
> +++ b/qemu-timer.c
> @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void)
>      }
>  }
>  
> -#ifndef CONFIG_IOTHREAD
>  static int64_t qemu_icount_delta(void)
>  {
>      if (!use_icount) {
> @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void)
>          return cpu_get_icount() - cpu_get_clock();
>      }
>  }
> -#endif
>  
>  /* enable cpu_get_ticks() */
>  void cpu_enable_ticks(void)
> @@ -1077,9 +1075,17 @@ void quit_timers(void)
>  
>  int qemu_calculate_timeout(void)
>  {
> -#ifndef CONFIG_IOTHREAD
>      int timeout;
>  
> +#ifdef CONFIG_IOTHREAD
> +    /* When using icount, making forward progress with qemu_icount when the
> +       guest CPU is idle is critical. We only use the static io-thread timeout
> +       for non icount runs.  */
> +    if (!use_icount) {
> +        return 1000;
> +    }
> +#endif
> +
>      if (!vm_running)
>          timeout = 5000;
>      else {
> @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void)
>      }
>  
>      return timeout;
> -#else /* CONFIG_IOTHREAD */
> -    return 1000;
> -#endif
>  }
>  
> 
> 

This logic and timeout values were imported on iothread merge. And I bet
at least the timeout value of 1s (vs. 5s) can still be found in
qemu-kvm. Maybe someone over there can remember the rationales behind
choosing this value.

Jan
Jan Kiszka - Jan. 18, 2011, 10:17 a.m.
On 2011-01-18 11:00, Jan Kiszka wrote:
> On 2011-01-18 01:19, Edgar E. Iglesias wrote:
>> On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote:
>>> Hi,
>>>
>>> I'm running an io-thread enabled qemu-system-mipsel with icount.
>>> When the guest (linux) goes to sleep through the wait insn (waiting
>>> to be woken up by future timer interrupts), the thing deadlocks.
>>>
>>> IIUC, this is because vm timers are driven by icount, but the CPU is
>>> halted so icount makes no progress and time stands still.
>>>
>>> I've locally disabled vcpu halting when icount is enabled, that
>>> works around my problem but of course makes qemu consume 100% host cpu.
>>>
>>> I don't know why I only see this problem with io-thread builds?
>>> Could be related timing and luck.
>>>
>>> Would be interesting to know if someone has any info on how this was
>>> intended to work (if it was)? And if there are ideas for better
>>> workarounds or fixes that don't disable vcpu halting entirely.
>>
>> Hi,
>>
>> I've found the problem. For some reason io-thread builds use a
>> static timeout for wait loops. The entire chunk of code that
>> makes sure qemu_icount makes forward progress when the CPU's
>> are idle has been ifdef'ed away...
>>
>> This fixes the problem for me, hopefully without affecting
>> io-thread runs without icount.
>>
>> commit 0f4f3a919952500b487b438c5520f07a1c6be35b
>> Author: Edgar E. Iglesias <edgar@axis.com>
>> Date:   Tue Jan 18 01:01:57 2011 +0100
>>
>>     qemu-timer: Fix timeout calc for io-thread with icount
>>     
>>     Make sure we always make forward progress with qemu_icount to
>>     avoid deadlocks. For io-thread, use the static 1000 timeout
>>     only if icount is disabled.
>>     
>>     Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
>>
>> diff --git a/qemu-timer.c b/qemu-timer.c
>> index 95814af..db1ec49 100644
>> --- a/qemu-timer.c
>> +++ b/qemu-timer.c
>> @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void)
>>      }
>>  }
>>  
>> -#ifndef CONFIG_IOTHREAD
>>  static int64_t qemu_icount_delta(void)
>>  {
>>      if (!use_icount) {
>> @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void)
>>          return cpu_get_icount() - cpu_get_clock();
>>      }
>>  }
>> -#endif
>>  
>>  /* enable cpu_get_ticks() */
>>  void cpu_enable_ticks(void)
>> @@ -1077,9 +1075,17 @@ void quit_timers(void)
>>  
>>  int qemu_calculate_timeout(void)
>>  {
>> -#ifndef CONFIG_IOTHREAD
>>      int timeout;
>>  
>> +#ifdef CONFIG_IOTHREAD
>> +    /* When using icount, making forward progress with qemu_icount when the
>> +       guest CPU is idle is critical. We only use the static io-thread timeout
>> +       for non icount runs.  */
>> +    if (!use_icount) {
>> +        return 1000;
>> +    }
>> +#endif
>> +
>>      if (!vm_running)
>>          timeout = 5000;
>>      else {
>> @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void)
>>      }
>>  
>>      return timeout;
>> -#else /* CONFIG_IOTHREAD */
>> -    return 1000;
>> -#endif
>>  }
>>  
>>
>>
> 
> This logic and timeout values were imported on iothread merge. And I bet
> at least the timeout value of 1s (vs. 5s) can still be found in
> qemu-kvm. Maybe someone over there can remember the rationales behind
> choosing this value.

Correction: qemu-kvm does _not_ use a fixed timeout value for the
iothread nor the removed code path (as CONFIG_IOTHREAD is off in
qemu-kvm, that tree does not even build when you enable it).

Still, the reason for once introducing this difference to qemu should be
reflected.

Jan
Marcelo Tosatti - Jan. 19, 2011, 5:02 p.m.
On Tue, Jan 18, 2011 at 11:00:57AM +0100, Jan Kiszka wrote:
> On 2011-01-18 01:19, Edgar E. Iglesias wrote:
> > On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote:
> >> Hi,
> >>
> >> I'm running an io-thread enabled qemu-system-mipsel with icount.
> >> When the guest (linux) goes to sleep through the wait insn (waiting
> >> to be woken up by future timer interrupts), the thing deadlocks.
> >>
> >> IIUC, this is because vm timers are driven by icount, but the CPU is
> >> halted so icount makes no progress and time stands still.
> >>
> >> I've locally disabled vcpu halting when icount is enabled, that
> >> works around my problem but of course makes qemu consume 100% host cpu.
> >>
> >> I don't know why I only see this problem with io-thread builds?
> >> Could be related timing and luck.
> >>
> >> Would be interesting to know if someone has any info on how this was
> >> intended to work (if it was)? And if there are ideas for better
> >> workarounds or fixes that don't disable vcpu halting entirely.
> > 
> > Hi,
> > 
> > I've found the problem. For some reason io-thread builds use a
> > static timeout for wait loops. The entire chunk of code that
> > makes sure qemu_icount makes forward progress when the CPU's
> > are idle has been ifdef'ed away...
> > 
> > This fixes the problem for me, hopefully without affecting
> > io-thread runs without icount.
> > 
> > commit 0f4f3a919952500b487b438c5520f07a1c6be35b
> > Author: Edgar E. Iglesias <edgar@axis.com>
> > Date:   Tue Jan 18 01:01:57 2011 +0100
> > 
> >     qemu-timer: Fix timeout calc for io-thread with icount
> >     
> >     Make sure we always make forward progress with qemu_icount to
> >     avoid deadlocks. For io-thread, use the static 1000 timeout
> >     only if icount is disabled.
> >     
> >     Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
> > 
> > diff --git a/qemu-timer.c b/qemu-timer.c
> > index 95814af..db1ec49 100644
> > --- a/qemu-timer.c
> > +++ b/qemu-timer.c
> > @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void)
> >      }
> >  }
> >  
> > -#ifndef CONFIG_IOTHREAD
> >  static int64_t qemu_icount_delta(void)
> >  {
> >      if (!use_icount) {
> > @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void)
> >          return cpu_get_icount() - cpu_get_clock();
> >      }
> >  }
> > -#endif
> >  
> >  /* enable cpu_get_ticks() */
> >  void cpu_enable_ticks(void)
> > @@ -1077,9 +1075,17 @@ void quit_timers(void)
> >  
> >  int qemu_calculate_timeout(void)
> >  {
> > -#ifndef CONFIG_IOTHREAD
> >      int timeout;
> >  
> > +#ifdef CONFIG_IOTHREAD
> > +    /* When using icount, making forward progress with qemu_icount when the
> > +       guest CPU is idle is critical. We only use the static io-thread timeout
> > +       for non icount runs.  */
> > +    if (!use_icount) {
> > +        return 1000;
> > +    }
> > +#endif
> > +
> >      if (!vm_running)
> >          timeout = 5000;
> >      else {
> > @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void)
> >      }
> >  
> >      return timeout;
> > -#else /* CONFIG_IOTHREAD */
> > -    return 1000;
> > -#endif
> >  }
> >  
> > 
> > 
> 
> This logic and timeout values were imported on iothread merge. And I bet
> at least the timeout value of 1s (vs. 5s) can still be found in
> qemu-kvm. Maybe someone over there can remember the rationales behind
> choosing this value.
> 
> Jan

This timeout is for the main select() call. So there is not a lot
of reasoning, how long to wait when there's no activity on the file
descriptors.
Edgar Iglesias - Jan. 19, 2011, 7:02 p.m.
On Wed, Jan 19, 2011 at 03:02:26PM -0200, Marcelo Tosatti wrote:
> On Tue, Jan 18, 2011 at 11:00:57AM +0100, Jan Kiszka wrote:
> > On 2011-01-18 01:19, Edgar E. Iglesias wrote:
> > > On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote:
> > >> Hi,
> > >>
> > >> I'm running an io-thread enabled qemu-system-mipsel with icount.
> > >> When the guest (linux) goes to sleep through the wait insn (waiting
> > >> to be woken up by future timer interrupts), the thing deadlocks.
> > >>
> > >> IIUC, this is because vm timers are driven by icount, but the CPU is
> > >> halted so icount makes no progress and time stands still.
> > >>
> > >> I've locally disabled vcpu halting when icount is enabled, that
> > >> works around my problem but of course makes qemu consume 100% host cpu.
> > >>
> > >> I don't know why I only see this problem with io-thread builds?
> > >> Could be related timing and luck.
> > >>
> > >> Would be interesting to know if someone has any info on how this was
> > >> intended to work (if it was)? And if there are ideas for better
> > >> workarounds or fixes that don't disable vcpu halting entirely.
> > > 
> > > Hi,
> > > 
> > > I've found the problem. For some reason io-thread builds use a
> > > static timeout for wait loops. The entire chunk of code that
> > > makes sure qemu_icount makes forward progress when the CPU's
> > > are idle has been ifdef'ed away...
> > > 
> > > This fixes the problem for me, hopefully without affecting
> > > io-thread runs without icount.
> > > 
> > > commit 0f4f3a919952500b487b438c5520f07a1c6be35b
> > > Author: Edgar E. Iglesias <edgar@axis.com>
> > > Date:   Tue Jan 18 01:01:57 2011 +0100
> > > 
> > >     qemu-timer: Fix timeout calc for io-thread with icount
> > >     
> > >     Make sure we always make forward progress with qemu_icount to
> > >     avoid deadlocks. For io-thread, use the static 1000 timeout
> > >     only if icount is disabled.
> > >     
> > >     Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
> > > 
> > > diff --git a/qemu-timer.c b/qemu-timer.c
> > > index 95814af..db1ec49 100644
> > > --- a/qemu-timer.c
> > > +++ b/qemu-timer.c
> > > @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void)
> > >      }
> > >  }
> > >  
> > > -#ifndef CONFIG_IOTHREAD
> > >  static int64_t qemu_icount_delta(void)
> > >  {
> > >      if (!use_icount) {
> > > @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void)
> > >          return cpu_get_icount() - cpu_get_clock();
> > >      }
> > >  }
> > > -#endif
> > >  
> > >  /* enable cpu_get_ticks() */
> > >  void cpu_enable_ticks(void)
> > > @@ -1077,9 +1075,17 @@ void quit_timers(void)
> > >  
> > >  int qemu_calculate_timeout(void)
> > >  {
> > > -#ifndef CONFIG_IOTHREAD
> > >      int timeout;
> > >  
> > > +#ifdef CONFIG_IOTHREAD
> > > +    /* When using icount, making forward progress with qemu_icount when the
> > > +       guest CPU is idle is critical. We only use the static io-thread timeout
> > > +       for non icount runs.  */
> > > +    if (!use_icount) {
> > > +        return 1000;
> > > +    }
> > > +#endif
> > > +
> > >      if (!vm_running)
> > >          timeout = 5000;
> > >      else {
> > > @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void)
> > >      }
> > >  
> > >      return timeout;
> > > -#else /* CONFIG_IOTHREAD */
> > > -    return 1000;
> > > -#endif
> > >  }
> > >  
> > > 
> > > 
> > 
> > This logic and timeout values were imported on iothread merge. And I bet
> > at least the timeout value of 1s (vs. 5s) can still be found in
> > qemu-kvm. Maybe someone over there can remember the rationales behind
> > choosing this value.
> > 
> > Jan
> 
> This timeout is for the main select() call. So there is not a lot
> of reasoning, how long to wait when there's no activity on the file
> descriptors.

OK, I suspected something like that. Thanks both of you for the info.
I'll give people a couple of days to complain at the patch, if noone
does I'll apply it.

Cheers
Edgar Iglesias - Jan. 23, 2011, 4:09 a.m.
On Wed, Jan 19, 2011 at 08:02:28PM +0100, Edgar E. Iglesias wrote:
> On Wed, Jan 19, 2011 at 03:02:26PM -0200, Marcelo Tosatti wrote:
> > On Tue, Jan 18, 2011 at 11:00:57AM +0100, Jan Kiszka wrote:
> > > On 2011-01-18 01:19, Edgar E. Iglesias wrote:
> > > > On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote:
> > > >> Hi,
> > > >>
> > > >> I'm running an io-thread enabled qemu-system-mipsel with icount.
> > > >> When the guest (linux) goes to sleep through the wait insn (waiting
> > > >> to be woken up by future timer interrupts), the thing deadlocks.
> > > >>
> > > >> IIUC, this is because vm timers are driven by icount, but the CPU is
> > > >> halted so icount makes no progress and time stands still.
> > > >>
> > > >> I've locally disabled vcpu halting when icount is enabled, that
> > > >> works around my problem but of course makes qemu consume 100% host cpu.
> > > >>
> > > >> I don't know why I only see this problem with io-thread builds?
> > > >> Could be related timing and luck.
> > > >>
> > > >> Would be interesting to know if someone has any info on how this was
> > > >> intended to work (if it was)? And if there are ideas for better
> > > >> workarounds or fixes that don't disable vcpu halting entirely.
> > > > 
> > > > Hi,
> > > > 
> > > > I've found the problem. For some reason io-thread builds use a
> > > > static timeout for wait loops. The entire chunk of code that
> > > > makes sure qemu_icount makes forward progress when the CPU's
> > > > are idle has been ifdef'ed away...
> > > > 
> > > > This fixes the problem for me, hopefully without affecting
> > > > io-thread runs without icount.
> > > > 
> > > > commit 0f4f3a919952500b487b438c5520f07a1c6be35b
> > > > Author: Edgar E. Iglesias <edgar@axis.com>
> > > > Date:   Tue Jan 18 01:01:57 2011 +0100
> > > > 
> > > >     qemu-timer: Fix timeout calc for io-thread with icount
> > > >     
> > > >     Make sure we always make forward progress with qemu_icount to
> > > >     avoid deadlocks. For io-thread, use the static 1000 timeout
> > > >     only if icount is disabled.
> > > >     
> > > >     Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
> > > > 
> > > > diff --git a/qemu-timer.c b/qemu-timer.c
> > > > index 95814af..db1ec49 100644
> > > > --- a/qemu-timer.c
> > > > +++ b/qemu-timer.c
> > > > @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void)
> > > >      }
> > > >  }
> > > >  
> > > > -#ifndef CONFIG_IOTHREAD
> > > >  static int64_t qemu_icount_delta(void)
> > > >  {
> > > >      if (!use_icount) {
> > > > @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void)
> > > >          return cpu_get_icount() - cpu_get_clock();
> > > >      }
> > > >  }
> > > > -#endif
> > > >  
> > > >  /* enable cpu_get_ticks() */
> > > >  void cpu_enable_ticks(void)
> > > > @@ -1077,9 +1075,17 @@ void quit_timers(void)
> > > >  
> > > >  int qemu_calculate_timeout(void)
> > > >  {
> > > > -#ifndef CONFIG_IOTHREAD
> > > >      int timeout;
> > > >  
> > > > +#ifdef CONFIG_IOTHREAD
> > > > +    /* When using icount, making forward progress with qemu_icount when the
> > > > +       guest CPU is idle is critical. We only use the static io-thread timeout
> > > > +       for non icount runs.  */
> > > > +    if (!use_icount) {
> > > > +        return 1000;
> > > > +    }
> > > > +#endif
> > > > +
> > > >      if (!vm_running)
> > > >          timeout = 5000;
> > > >      else {
> > > > @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void)
> > > >      }
> > > >  
> > > >      return timeout;
> > > > -#else /* CONFIG_IOTHREAD */
> > > > -    return 1000;
> > > > -#endif
> > > >  }
> > > >  
> > > > 
> > > > 
> > > 
> > > This logic and timeout values were imported on iothread merge. And I bet
> > > at least the timeout value of 1s (vs. 5s) can still be found in
> > > qemu-kvm. Maybe someone over there can remember the rationales behind
> > > choosing this value.
> > > 
> > > Jan
> > 
> > This timeout is for the main select() call. So there is not a lot
> > of reasoning, how long to wait when there's no activity on the file
> > descriptors.
> 
> OK, I suspected something like that. Thanks both of you for the info.
> I'll give people a couple of days to complain at the patch, if noone
> does I'll apply it.

Silence - so I've applied this one, thanks.

Cheers

Patch

diff --git a/qemu-timer.c b/qemu-timer.c
index 95814af..db1ec49 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -110,7 +110,6 @@  static int64_t cpu_get_clock(void)
     }
 }
 
-#ifndef CONFIG_IOTHREAD
 static int64_t qemu_icount_delta(void)
 {
     if (!use_icount) {
@@ -124,7 +123,6 @@  static int64_t qemu_icount_delta(void)
         return cpu_get_icount() - cpu_get_clock();
     }
 }
-#endif
 
 /* enable cpu_get_ticks() */
 void cpu_enable_ticks(void)
@@ -1077,9 +1075,17 @@  void quit_timers(void)
 
 int qemu_calculate_timeout(void)
 {
-#ifndef CONFIG_IOTHREAD
     int timeout;
 
+#ifdef CONFIG_IOTHREAD
+    /* When using icount, making forward progress with qemu_icount when the
+       guest CPU is idle is critical. We only use the static io-thread timeout
+       for non icount runs.  */
+    if (!use_icount) {
+        return 1000;
+    }
+#endif
+
     if (!vm_running)
         timeout = 5000;
     else {
@@ -1110,8 +1116,5 @@  int qemu_calculate_timeout(void)
     }
 
     return timeout;
-#else /* CONFIG_IOTHREAD */
-    return 1000;
-#endif
 }