Patchwork Re: [PATCH 2/7] Enable I/O thread and VNC threads by default

Submitter Marcelo Tosatti
Date Feb. 7, 2011, 4:03 p.m.
Message ID <20110207160350.GA26332@amt.cnet>
Permalink /patch/82135/
State New

Comments

Marcelo Tosatti - Feb. 7, 2011, 4:03 p.m.
On Mon, Feb 07, 2011 at 08:12:55AM -0200, Marcelo Tosatti wrote:
> > > One more thing I didn't mention on the email-thread or on IRC is
> > > that last time I checked, qemu with io-thread was performing
> > > significantly slower than non io-thread builds. That was with
> > > TCG emulation (not kvm). Somewhere between 5 - 10% slower, IIRC.
> 
> Can you recall what was the test ?
> 
> > > Also, although -icount & iothread no longer deadlocks, icount
> > > still sometimes performs incredibly slow with the io-thread (compared
> > > to non-io-thread qemu). In particular when not using -icount auto but
> > > a fixed ticks per insn values. Sometimes it's so slow I thought it
> > > actually deadlocked, but no it was crawling :) I haven't had time
> > > to look at it any closer but I hope to do soon.

Edgar, please give the attached patch a try with a fixed icount value. The
calculation of the next event makes no sense for the iothread timeout, only
for the vcpu context.

> > > 
> > > These issues should be fixable though, so I'm not arguing against
> > > enabling it per default. Just mentioning what I've seen FYI..
> > 
> > Right, remember seeing 20% added overhead for network copy with TCG on
> > the initial iothread merge.
> 
> This is not the case anymore, network transfer speed is comparable.
> Probably due to SIG_IPI delivery being reliable, which was fixed later.

Is there any other issue that prevents turning CONFIG_IOTHREAD on by
default?
Paolo Bonzini - Feb. 7, 2011, 4:23 p.m.
On 02/07/2011 05:03 PM, Marcelo Tosatti wrote:
> Is there any other issue that prevents turning CONFIG_IOTHREAD on by
> default?

I think Windows support.

Signal support is actually easy because we can "hack" the IPI as 
"suspend the VCPU thread+do work in the iothread context+resume the VCPU 
thread" (the IPI handler doesn't longjmp).

Threading primitives support is tricky but not hard (there is lots of 
code around, especially if you can make assumptions such as "always hold 
the mutex while signaling a cond. variable").

Paolo
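
For reference, the discipline mentioned above ("always hold the mutex while
signaling a cond. variable") looks roughly like this with POSIX threads. This
is only a minimal sketch of the calling convention, not QEMU code and not a
Win32 port; the names work_ready, waiter and run_handoff are made up for
illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool work_ready;              /* predicate protected by 'lock' */

/* Waiter: re-check the predicate in a loop, always under the mutex. */
static void *waiter(void *arg)
{
    int *seen = arg;

    pthread_mutex_lock(&lock);
    while (!work_ready) {
        pthread_cond_wait(&cond, &lock);
    }
    *seen = 1;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Signaler: change the predicate AND signal while still holding the mutex.
 * Returns 1 once the waiter has observed the work. */
int run_handoff(void)
{
    pthread_t tid;
    int seen = 0;
    int rc;

    pthread_mutex_lock(&lock);
    work_ready = false;
    pthread_mutex_unlock(&lock);

    rc = pthread_create(&tid, NULL, waiter, &seen);
    assert(rc == 0);

    pthread_mutex_lock(&lock);
    work_ready = true;
    pthread_cond_signal(&cond);      /* mutex is held here, by convention */
    pthread_mutex_unlock(&lock);

    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    return seen;
}
```

Because the signaler never touches the condition variable without the mutex,
an emulated condition variable can assume that the set of waiters does not
change during a signal, which is the kind of assumption that can simplify a
port to another threading API.
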
Jan Kiszka - Feb. 7, 2011, 5:10 p.m.
On 2011-02-07 17:23, Paolo Bonzini wrote:
> On 02/07/2011 05:03 PM, Marcelo Tosatti wrote:
>> Is there any other issue that prevents turning CONFIG_IOTHREAD on by
>> default?
> 
> I think Windows support.
> 
> Signal support is actually easy because we can "hack" the IPI as
> "suspend the VCPU thread+do work in the iothread context+resume the VCPU
> thread" (the IPI handler doesn't longjmp).
> 
> Threading primitives support is tricky but not hard (there is lots of
> code around, especially if you can make assumptions such as "always hold
> the mutex while signaling a cond. variable").

!CONFIG_IOTHREAD code is doomed to bitrot once we switch to default
iothread mode. So if Windows support is not converted to a threading
model with moderate differences to POSIX, it will likely bitrot as well.
Therefore, conversion should be started rather sooner than later (by
someone interested in that platform).

Jan
Edgar Iglesias - Feb. 7, 2011, 6:35 p.m.
On Mon, Feb 07, 2011 at 02:03:50PM -0200, Marcelo Tosatti wrote:
> On Mon, Feb 07, 2011 at 08:12:55AM -0200, Marcelo Tosatti wrote:
> > > > One more thing I didn't mention on the email-thread or on IRC is
> > > > that last time I checked, qemu with io-thread was performing
> > > > significantly slower than non io-thread builds. That was with
> > > > TCG emulation (not kvm). Somewhere between 5 - 10% slower, IIRC.
> > 
> > Can you recall what was the test ?
> > 
> > > > Also, although -icount & iothread no longer deadlocks, icount
> > > > still sometimes performs incredibly slow with the io-thread (compared
> > > > to non-io-thread qemu). In particular when not using -icount auto but
> > > > a fixed ticks per insn values. Sometimes it's so slow I thought it
> > > > actually deadlocked, but no it was crawling :) I haven't had time
> > > > to look at it any closer but I hope to do soon.
> 
> Edgar, please give the attached patch a try with fixed icount value. The
> calculation for next event makes no sense for iothread timeout, only for
> vcpu context.


Thanks Marcelo, this patch fixes the problems I was seeing here.

Cheers


> diff --git a/cpus.c b/cpus.c
> index 9c50a34..2280db1 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -748,7 +748,7 @@ static void qemu_tcg_wait_io_event(void)
>      CPUState *env;
>  
>      while (!any_cpu_has_work())
> -        qemu_cond_timedwait(tcg_halt_cond, &qemu_global_mutex, 1000);
> +        qemu_cond_timedwait(tcg_halt_cond, &qemu_global_mutex, qemu_calculate_timeout());
>  
>      qemu_mutex_unlock(&qemu_global_mutex);
>  
> diff --git a/vl.c b/vl.c
> index 837be97..dbd81a1 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1323,7 +1323,7 @@ void main_loop_wait(int nonblocking)
>      if (nonblocking)
>          timeout = 0;
>      else {
> -        timeout = qemu_calculate_timeout();
> +        timeout = 1000;
>          qemu_bh_update_timeout(&timeout);
>      }
>
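
The patch above swaps which wait uses the fixed value of 1000 and which one
uses qemu_calculate_timeout(). For readers unfamiliar with timed condition
waits, here is a rough POSIX sketch of a wait with a relative timeout. It
assumes the timeout is given in milliseconds and it is not QEMU's actual
qemu_cond_timedwait(); cond_timedwait_ms and wait_until_timeout are
hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: wait on 'cond' for at most 'ms' milliseconds.
 * Returns 0 on wakeup (possibly spurious) or ETIMEDOUT on timeout. */
static int cond_timedwait_ms(pthread_cond_t *cond, pthread_mutex_t *mutex,
                             uint64_t ms)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_REALTIME, &ts);

    assert(rc == 0);
    /* pthread_cond_timedwait() takes an absolute deadline. */
    ts.tv_sec += (time_t)(ms / 1000);
    ts.tv_nsec += (long)(ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {
        ts.tv_sec += 1;
        ts.tv_nsec -= 1000000000L;
    }
    return pthread_cond_timedwait(cond, mutex, &ts);
}

/* Demo: nobody signals the condition, so the wait ends with ETIMEDOUT.
 * A spurious wakeup simply restarts the wait, which is fine for a demo. */
int wait_until_timeout(uint64_t ms)
{
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t c = PTHREAD_COND_INITIALIZER;
    int rc;

    pthread_mutex_lock(&m);
    do {
        rc = cond_timedwait_ms(&c, &m, ms);
    } while (rc == 0);
    pthread_mutex_unlock(&m);
    return rc;
}
```

A shorter timeout makes the waiting thread re-check its condition more often,
a longer one lets it sleep; which thread should get the calculated value is
exactly what the patch changes.
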
Aurelien Jarno - Feb. 7, 2011, 8:44 p.m.
On Mon, Feb 07, 2011 at 02:03:50PM -0200, Marcelo Tosatti wrote:
> On Mon, Feb 07, 2011 at 08:12:55AM -0200, Marcelo Tosatti wrote:
> > > > One more thing I didn't mention on the email-thread or on IRC is
> > > > that last time I checked, qemu with io-thread was performing
> > > > significantly slower than non io-thread builds. That was with
> > > > TCG emulation (not kvm). Somewhere between 5 - 10% slower, IIRC.
> > 
> > Can you recall what was the test ?
> > 

It's also something I've seen using network transfer in guest. IIRC the
biggest slowdown was using the smc91c111 card under qemu-system-arm
where it was about 20% slower. Other cards on other architectures (I
remember testing powerpc, mips and sh4) are more in the 5 to 10 % area.
Anthony Liguori - Feb. 7, 2011, 9:02 p.m.
On 02/07/2011 11:10 AM, Jan Kiszka wrote:
> On 2011-02-07 17:23, Paolo Bonzini wrote:
>    
>> On 02/07/2011 05:03 PM, Marcelo Tosatti wrote:
>>      
>>> Is there any other issue that prevents turning CONFIG_IOTHREAD on by
>>> default?
>>>        
>> I think Windows support.
>>
>> Signal support is actually easy because we can "hack" the IPI as
>> "suspend the VCPU thread+do work in the iothread context+resume the VCPU
>> thread" (the IPI handler doesn't longjmp).
>>
>> Threading primitives support is tricky but not hard (there is lots of
>> code around, especially if you can make assumptions such as "always hold
>> the mutex while signaling a cond. variable").
>>      
> !CONFIG_IOTHREAD code is doomed to bitrot once we switch to default
> iothread mode. So if Windows support is not converted to a threading
> model with moderate differences to POSIX, it will likely bitrot as well.
> Therefore, conversion should be started rather sooner than later (by
> someone interested in that platform).
>    

As far as I'm concerned, Windows support is already deprecated as no one
has stepped up to enhance or support it for a number of years now.  We
shouldn't remove existing code that supports it or refuse to take
reasonable patches, but if enabling the IO thread by default breaks it, so be it.

Regards,

Anthony Liguori

> Jan
>
>
Scott Wood - Feb. 7, 2011, 9:30 p.m.
On Mon, 7 Feb 2011 14:03:50 -0200
Marcelo Tosatti <mtosatti@redhat.com> wrote:

> Is there any other issue that prevents turning CONFIG_IOTHREAD on by
> default?
> 

This patch is needed for ppce500_mpc8544ds and ppc440_bamboo to work with
I/O thread enabled:

http://patchwork.ozlabs.org/patch/66743/

-Scott
Aurelien Jarno - Feb. 7, 2011, 9:45 p.m.
On Mon, Feb 07, 2011 at 03:02:03PM -0600, Anthony Liguori wrote:
> On 02/07/2011 11:10 AM, Jan Kiszka wrote:
>> On 2011-02-07 17:23, Paolo Bonzini wrote:
>>    
>>> On 02/07/2011 05:03 PM, Marcelo Tosatti wrote:
>>>      
>>>> Is there any other issue that prevents turning CONFIG_IOTHREAD on by
>>>> default?
>>>>        
>>> I think Windows support.
>>>
>>> Signal support is actually easy because we can "hack" the IPI as
>>> "suspend the VCPU thread+do work in the iothread context+resume the VCPU
>>> thread" (the IPI handler doesn't longjmp).
>>>
>>> Threading primitives support is tricky but not hard (there is lots of
>>> code around, especially if you can make assumptions such as "always hold
>>> the mutex while signaling a cond. variable").
>>>      
>> !CONFIG_IOTHREAD code is doomed to bitrot once we switch to default
>> iothread mode. So if Windows support is not converted to a threading
>> model with moderate differences to POSIX, it will likely bitrot as well.
>> Therefore, conversion should be started rather sooner than later (by
>> someone interested in that platform).
>>    
>
> As far as I'm concerned, Windows support is already deprecated as no one
> has stepped up to enhance or support it for a number of years now.  We
> shouldn't remove existing code that supports it or refuse to take  
> reasonable patches but if enabling IO thread by default breaks it, so be 
> it.

As far as I see, Blue Swirl and Stefan Weil are regularly committing
fixes for win32. Stefan Weil is also providing win32 binaries on his
website [1]. I wouldn't call that deprecated.

[1] http://qemu.weilnetz.de/
Anthony Liguori - Feb. 8, 2011, 2:09 a.m.
On 02/07/2011 03:45 PM, Aurelien Jarno wrote:
> On Mon, Feb 07, 2011 at 03:02:03PM -0600, Anthony Liguori wrote:
>    
>> As far as I'm concerned, Windows support is already deprecated as no one
>> has stepped up to enhance or support it for a number of years now.  We
>> shouldn't remove existing code that supports it or refuse to take
>> reasonable patches but if enabling IO thread by default breaks it, so be
>> it.
>>      
> As far as I see, Blue Swirl and Stefan Weil are regularly committing
> fixes for win32. Stefan Weil is also providing win32 binaries on his
> website [1]. I wouldn't call that deprecated.
>    

Occasional compile fixes are a long way from something that is regularly
tested and well maintained.

Win32 still doesn't have a proper AIO implementation, which is probably
close to a 4-year-old FIXME.

Regards,

Anthony Liguori

> [1] http://qemu.weilnetz.de/
>
>
Aurelien Jarno - Feb. 8, 2011, 7:26 a.m.
On Mon, Feb 07, 2011 at 08:09:52PM -0600, Anthony Liguori wrote:
> On 02/07/2011 03:45 PM, Aurelien Jarno wrote:
>> On Mon, Feb 07, 2011 at 03:02:03PM -0600, Anthony Liguori wrote:
>>    
>>> As far as I'm concerned, Windows support is already deprecated as no one
>>> has stepped up to enhance or support it for a number of years now.  We
>>> shouldn't remove existing code that supports it or refuse to take
>>> reasonable patches but if enabling IO thread by default breaks it, so be
>>> it.
>>>      
>> As far as I see, Blue Swirl and Stefan Weil are regularly committing
>> fixes for win32. Stefan Weil is also providing win32 binaries on his
>> website [1]. I wouldn't call that deprecated.
>>    
>
> Occasional compile fixes is a long way from something that is regularly  
> tested and well maintained.
>
> Win32 still doesn't have a proper AIO implementation which is probably  
> close to a 4 year old FIXME.
>

I don't remember when we decided that AIO should be implemented on
every host OS. Any pointers?
Paolo Bonzini - Feb. 8, 2011, 8:08 a.m.
On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
> I forget to remember when we decided that AIO should be implemented on
> any host OS. Any pointer?

To be fair, I/O-heavy workloads are almost unusable without AIO.  For 
Windows targets, they also crash under SMP due to the Windows AP
watchdog.  But then TCG and SMP do not go very well together anyway.

However, I think deprecating Win32 support would be a very bad idea.

Paolo
Jan Kiszka - Feb. 8, 2011, 8:50 a.m.
On 2011-02-08 09:08, Paolo Bonzini wrote:
> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>> I forget to remember when we decided that AIO should be implemented on
>> any host OS. Any pointer?
> 
> To be fair, I/O-heavy workloads are almost unusable without AIO.  For 
> Windows targets, they also crash under SMP due to the Windows AP
> watchdog.  But then TCG and SMP do not go very well together anyway.
> 
> However, I think deprecating Win32 support would be a very bad idea.

It would be too early at this point.

But if Windows ever becomes the only reason to keep tons of barely tested
code paths around or to invest significant additional effort in changing
logic or interfaces in this area, then I would prefer that step. I've been
hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
subtle differences are really a PITA and a source of various breakages.

People interested in that platform should finally realize that its fate
is coupled to reducing the #ifdefs as well as the design differences we
see right now and even more in the future.

Jan
Aurelien Jarno - Feb. 8, 2011, 9:05 a.m.
Jan Kiszka wrote:
> On 2011-02-08 09:08, Paolo Bonzini wrote:
>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>>> I forget to remember when we decided that AIO should be implemented on
>>> any host OS. Any pointer?
>> To be fair, I/O-heavy workloads are almost unusable without AIO.  For 
>> Windows targets, they also crash under SMP due to the Windows AP
>> watchdog.  But then TCG and SMP do not go very well together anyway.
>>
>> However, I think deprecating Win32 support would be a very bad idea.
> 
> It would be too early at this point.
> 
> But if Windows is once the only reason to keep tons of hardly tested
> code paths around or to invest significant additional effort to change
> logic or interfaces in this area, then I would prefer that step. I'm
> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
> subtle differences are really a PITA and source of various breakages.
> 
> People interested in that platform should finally realize that its fate
> is coupled to reducing the #ifdefs as well as the design differences we
> see right now and even more in the future.
> 

The culprit here is IOTHREAD. Windows support predates the IOTHREAD concept;
it's just that the people who introduced IOTHREAD didn't care about Windows
support at all and added these #ifdefs. Disabling Windows support because
of that is not fair.

We should probably get rid of KVM support in QEMU, so if someone has an
idea for a cool TCG feature that can't be supported in KVM, it's the
moment to submit it. We can add it with #ifdef, and in one year just ask
for KVM support removal.
Anthony Liguori - Feb. 8, 2011, 9:12 a.m.
On 02/08/2011 03:05 AM, Aurelien Jarno wrote:
> Jan Kiszka wrote:
>    
>> On 2011-02-08 09:08, Paolo Bonzini wrote:
>>      
>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>>>        
>>>> I forget to remember when we decided that AIO should be implemented on
>>>> any host OS. Any pointer?
>>>>          
>>> To be fair, I/O-heavy workloads are almost unusable without AIO.  For
>>> Windows targets, they also crash under SMP due to the Windows AP
>>> watchdog.  But then TCG and SMP do not go very well together anyway.
>>>
>>> However, I think deprecating Win32 support would be a very bad idea.
>>>        
>> It would be too early at this point.
>>
>> But if Windows is once the only reason to keep tons of hardly tested
>> code paths around or to invest significant additional effort to change
>> logic or interfaces in this area, then I would prefer that step. I'm
>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
>> subtle differences are really a PITA and source of various breakages.
>>
>> People interested in that platform should finally realize that its fate
>> is coupled to reducing the #ifdefs as well as the design differences we
>> see right now and even more in the future.
>>
>>      
> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept,
>    

IOTHREAD is actually just as necessary for TCG as it is for KVM.  
Otherwise, you have a signal/select race that cannot be avoided.
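
The race referred to here is presumably the classic one: a signal that
arrives after the program has checked for pending work, but before it blocks
in select(), is handled too early to interrupt the sleep. As an illustration
only (this is not how QEMU or the iothread addresses it), one POSIX way to
close that window in a single thread is to keep the signal blocked and let
pselect() unblock it atomically. wait_interrupted_by and on_signal are
made-up names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t got_signal;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Returns 1 if pselect() was interrupted by 'signo', 0 otherwise. */
int wait_interrupted_by(int signo, long timeout_sec)
{
    struct sigaction sa;
    sigset_t block, orig, waitmask;
    struct timespec ts = { .tv_sec = timeout_sec, .tv_nsec = 0 };
    int rc, interrupted = 0;

    got_signal = 0;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(signo, &sa, NULL);
    assert(rc == 0);

    /* Keep the signal blocked while checking for pending work. */
    sigemptyset(&block);
    sigaddset(&block, signo);
    rc = sigprocmask(SIG_BLOCK, &block, &orig);
    assert(rc == 0);

    /* Simulate the signal arriving inside the critical window: it stays
     * pending because it is blocked, so the handler has not run yet. */
    raise(signo);

    if (!got_signal) {
        /* pselect() installs 'waitmask' and sleeps as one atomic step, so
         * the pending signal interrupts the wait instead of being lost. */
        waitmask = orig;
        sigdelset(&waitmask, signo);
        rc = pselect(0, NULL, NULL, NULL, &ts, &waitmask);
        interrupted = (rc == -1 && errno == EINTR);
    }

    rc = sigprocmask(SIG_SETMASK, &orig, NULL);
    assert(rc == 0);
    return interrupted;
}
```

With a plain select() and the signal unblocked beforehand, the handler would
run before the call and select() would then sleep for the full timeout.
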

QEMU has never "supported" Windows.  It happens to compile on Windows, 
but historically the Windows build has been non-functional for long 
periods of time and is still missing basic features (like AIO).

Regards,

Anthony Liguori

> it's just that people who introduce IOTHREAD didn't care about Windows
> support at all and added these #ifdef. Disabling Windows support because
> of that is not fair.
>
> We should probably get rid of KVM support in QEMU, so if someone has an
> idea for a cool TCG feature that can't be supported in KVM, it's the
> moment to submit it. We can add it with #ifdef, and in one year just ask
> for KVM support removal.
>
>
Paolo Bonzini - Feb. 8, 2011, 9:49 a.m.
On 02/08/2011 10:12 AM, Anthony Liguori wrote:
>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept,
>
> QEMU has never "supported" Windows.

I think both assertions are false.

What's true is that the Win32 port has never evolved beyond "barely 
functional", at least by the standards with which QEMU is judged under 
Linux.

Paolo
Jan Kiszka - Feb. 8, 2011, 9:51 a.m.
On 2011-02-08 10:05, Aurelien Jarno wrote:
> Jan Kiszka wrote:
>> On 2011-02-08 09:08, Paolo Bonzini wrote:
>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>>>> I forget to remember when we decided that AIO should be implemented on
>>>> any host OS. Any pointer?
>>> To be fair, I/O-heavy workloads are almost unusable without AIO.  For 
>>> Windows targets, they also crash under SMP due to the Windows AP
>>> watchdog.  But then TCG and SMP do not go very well together anyway.
>>>
>>> However, I think deprecating Win32 support would be a very bad idea.
>>
>> It would be too early at this point.
>>
>> But if Windows is once the only reason to keep tons of hardly tested
>> code paths around or to invest significant additional effort to change
>> logic or interfaces in this area, then I would prefer that step. I'm
>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
>> subtle differences are really a PITA and source of various breakages.
>>
>> People interested in that platform should finally realize that its fate
>> is coupled to reducing the #ifdefs as well as the design differences we
>> see right now and even more in the future.
>>
> 
> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept,
> it's just that people who introduce IOTHREAD didn't care about Windows
> support at all and added these #ifdef. Disabling Windows support because
> of that is not fair.

The TCG execution model won't scale long-term. It's already a pain to
boot a quad or just dual core VM, even more when your host has at least
as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the
future, and the iothread will just be one of 7, 17 or 257 threads.

Jan
Aurelien Jarno - Feb. 8, 2011, 9:58 a.m.
Jan Kiszka wrote:
> On 2011-02-08 10:05, Aurelien Jarno wrote:
>> Jan Kiszka wrote:
>>> On 2011-02-08 09:08, Paolo Bonzini wrote:
>>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>>>>> I forget to remember when we decided that AIO should be implemented on
>>>>> any host OS. Any pointer?
>>>> To be fair, I/O-heavy workloads are almost unusable without AIO.  For 
>>>> Windows targets, they also crash under SMP due to the Windows AP
>>>> watchdog.  But then TCG and SMP do not go very well together anyway.
>>>>
>>>> However, I think deprecating Win32 support would be a very bad idea.
>>> It would be too early at this point.
>>>
>>> But if Windows is once the only reason to keep tons of hardly tested
>>> code paths around or to invest significant additional effort to change
>>> logic or interfaces in this area, then I would prefer that step. I'm
>>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
>>> subtle differences are really a PITA and source of various breakages.
>>>
>>> People interested in that platform should finally realize that its fate
>>> is coupled to reducing the #ifdefs as well as the design differences we
>>> see right now and even more in the future.
>>>
>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept,
>> it's just that people who introduce IOTHREAD didn't care about Windows
>> support at all and added these #ifdef. Disabling Windows support because
>> of that is not fair.
> 
> The TCG execution model won't scale long-term. It's already a pain to
> boot a quad or just dual core VM, even more when your host has at least
> as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the
> future, and the iothread will just be one of 7, 17 or 257 threads.
> 

And what's the issue with that? People don't always look for performance
when using QEMU. They even often try to emulate old machines (and non
x86 ones), which anyway only have one CPU. This won't change in 5 years,
the only thing is that those machines will be 5 years older.

People have to keep in mind that QEMU doesn't mean only virtualization
and doesn't mean only x86.
Jan Kiszka - Feb. 8, 2011, 10:03 a.m.
On 2011-02-08 10:58, Aurelien Jarno wrote:
> Jan Kiszka wrote:
>> On 2011-02-08 10:05, Aurelien Jarno wrote:
>>> Jan Kiszka wrote:
>>>> On 2011-02-08 09:08, Paolo Bonzini wrote:
>>>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>>>>>> I forget to remember when we decided that AIO should be implemented on
>>>>>> any host OS. Any pointer?
>>>>> To be fair, I/O-heavy workloads are almost unusable without AIO.  For 
>>>>> Windows targets, they also crash under SMP due to the Windows AP
>>>>> watchdog.  But then TCG and SMP do not go very well together anyway.
>>>>>
>>>>> However, I think deprecating Win32 support would be a very bad idea.
>>>> It would be too early at this point.
>>>>
>>>> But if Windows is once the only reason to keep tons of hardly tested
>>>> code paths around or to invest significant additional effort to change
>>>> logic or interfaces in this area, then I would prefer that step. I'm
>>>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
>>>> subtle differences are really a PITA and source of various breakages.
>>>>
>>>> People interested in that platform should finally realize that its fate
>>>> is coupled to reducing the #ifdefs as well as the design differences we
>>>> see right now and even more in the future.
>>>>
>>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept,
>>> it's just that people who introduce IOTHREAD didn't care about Windows
>>> support at all and added these #ifdef. Disabling Windows support because
>>> of that is not fair.
>>
>> The TCG execution model won't scale long-term. It's already a pain to
>> boot a quad or just dual core VM, even more when your host has at least
>> as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the
>> future, and the iothread will just be one of 7, 17 or 257 threads.
>>
> 
> And what's the issue with that? People don't always look for performance
> when using QEMU. They even often try to emulate old machines (and non
> x86 ones), which anyway only have one CPU. This won't change in 5 years,
> the only thing is that those machines will be 5 years older.
> 
> People have to keep in mind that QEMU doesn't mean only virtualization
> and doesn't mean only x86.

I'm not talking about virtualization here. I'm talking about usable
emulation of today's (!) embedded multi-core platforms. It matters a lot
if your test roundtrip for booting into an SMP guest and running some
apps is a few tens of seconds, a few minutes or even not practically working.
Ever tried to boot a 16 core VM in emulation mode? I did, for fun. I
just hope I'll never depend on this for work.

Jan
Paolo Bonzini - Feb. 8, 2011, 10:06 a.m.
On 02/08/2011 10:58 AM, Aurelien Jarno wrote:
> And what's the issue with that? People don't always look for performance
> when using QEMU. They even often try to emulate old machines (and non
> x86 ones), which anyway only have one CPU. This won't change in 5 years,
> the only thing is that those machines will be 5 years older.
>
> People have to keep in mind that QEMU doesn't mean only virtualization
> and doesn't mean only x86.

AFAIU nobody is proposing to rip out linux-user or TCG, just to improve its
implementation.

You just as well have to understand that AIO means fewer Windows blue 
screens of death and not only better performance.

Paolo
Aurelien Jarno - Feb. 8, 2011, 10:06 a.m.
Jan Kiszka wrote:
> On 2011-02-08 10:58, Aurelien Jarno wrote:
>> Jan Kiszka wrote:
>>> On 2011-02-08 10:05, Aurelien Jarno wrote:
>>>> Jan Kiszka wrote:
>>>>> On 2011-02-08 09:08, Paolo Bonzini wrote:
>>>>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>>>>>>> I forget to remember when we decided that AIO should be implemented on
>>>>>>> any host OS. Any pointer?
>>>>>> To be fair, I/O-heavy workloads are almost unusable without AIO.  For 
>>>>>> Windows targets, they also crash under SMP due to the Windows AP
>>>>>> watchdog.  But then TCG and SMP do not go very well together anyway.
>>>>>>
>>>>>> However, I think deprecating Win32 support would be a very bad idea.
>>>>> It would be too early at this point.
>>>>>
>>>>> But if Windows is once the only reason to keep tons of hardly tested
>>>>> code paths around or to invest significant additional effort to change
>>>>> logic or interfaces in this area, then I would prefer that step. I'm
>>>>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
>>>>> subtle differences are really a PITA and source of various breakages.
>>>>>
>>>>> People interested in that platform should finally realize that its fate
>>>>> is coupled to reducing the #ifdefs as well as the design differences we
>>>>> see right now and even more in the future.
>>>>>
>>>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept,
>>>> it's just that people who introduce IOTHREAD didn't care about Windows
>>>> support at all and added these #ifdef. Disabling Windows support because
>>>> of that is not fair.
>>> The TCG execution model won't scale long-term. It's already a pain to
>>> boot a quad or just dual core VM, even more when your host has at least
>>> as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the
>>> future, and the iothread will just be one of 7, 17 or 257 threads.
>>>
>> And what's the issue with that? People don't always look for performance
>> when using QEMU. They even often try to emulate old machines (and non
>> x86 ones), which anyway only have one CPU. This won't change in 5 years,
>> the only thing is that those machines will be 5 years older.
>>
>> People have to keep in mind that QEMU doesn't mean only virtualization
>> and doesn't mean only x86.
> 
> I'm not talking about virtualization here. I'm talking about usable
> emulation of today's (!) embedded multi-core platforms. It matters a lot
> if your test roundtrip for booting into a SMP guest and running some
> apps is a few 10 seconds, a few minutes or even not practically working.
> Ever tried to boot a 16 core VM in emulation mode? I did, for fun. I
> just hope I'll never depend on this for work.

Yes, it's slow. But is it a problem? You assume that people use QEMU
only for emulating SMP platforms. This is a wrong assumption. Besides the
x86 target, only sparc really supports SMP emulation.
Alexander Graf - Feb. 8, 2011, 10:16 a.m.
On 08.02.2011, at 11:06, Aurelien Jarno wrote:

> Jan Kiszka wrote:
>> On 2011-02-08 10:58, Aurelien Jarno wrote:
>>> Jan Kiszka wrote:
>>>> On 2011-02-08 10:05, Aurelien Jarno wrote:
>>>>> Jan Kiszka wrote:
>>>>>> On 2011-02-08 09:08, Paolo Bonzini wrote:
>>>>>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>>>>>>>> I forget to remember when we decided that AIO should be implemented on
>>>>>>>> any host OS. Any pointer?
>>>>>>> To be fair, I/O-heavy workloads are almost unusable without AIO.  For 
>>>>>>> Windows targets, they also crash under SMP due to the Windows AP
>>>>>>> watchdog.  But then TCG and SMP do not go very well together anyway.
>>>>>>> 
>>>>>>> However, I think deprecating Win32 support would be a very bad idea.
>>>>>> It would be too early at this point.
>>>>>> 
>>>>>> But if Windows is once the only reason to keep tons of hardly tested
>>>>>> code paths around or to invest significant additional effort to change
>>>>>> logic or interfaces in this area, then I would prefer that step. I'm
>>>>>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
>>>>>> subtle differences are really a PITA and source of various breakages.
>>>>>> 
>>>>>> People interested in that platform should finally realize that its fate
>>>>>> is coupled to reducing the #ifdefs as well as the design differences we
>>>>>> see right now and even more in the future.
>>>>>> 
>>>>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept,
>>>>> it's just that people who introduce IOTHREAD didn't care about Windows
>>>>> support at all and added these #ifdef. Disabling Windows support because
>>>>> of that is not fair.
>>>> The TCG execution model won't scale long-term. It's already a pain to
>>>> boot a quad or just dual core VM, even more when your host has at least
>>>> as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the
>>>> future, and the iothread will just be one of 7, 17 or 257 threads.
>>>> 
>>> And what's the issue with that? People don't always look for performance
>>> when using QEMU. They even often try to emulate old machines (and non
>>> x86 ones), which anyway only have one CPU. This won't change in 5 years,
>>> the only thing is that those machines will be 5 years older.
>>> 
>>> People have to keep in mind that QEMU doesn't mean only virtualization
>>> and doesn't mean only x86.
>> 
>> I'm not talking about virtualization here. I'm talking about usable
>> emulation of today's (!) embedded multi-core platforms. It matters a lot
>> if your test roundtrip for booting into a SMP guest and running some
>> apps is a few 10 seconds, a few minutes or even not practically working.
>> Ever tried to boot a 16 core VM in emulation mode? I did, for fun. I
>> just hope I'll never depend on this for work.
> 
> Yes, it's slow. But is it a problem? You assume that people use QEMU
> only for emulating SMP platforms. This is a wrong assumption. Beside the
> x86 target, only sparc really supports SMP emulation.

I guess his point here really is that SMP will soon be a commodity. Most new ARM cores are moving to SMP by default, MIPS is there already and even embedded PPC has been multi-core for a while now. Sure, you can work around things by only emulating a single core at times, but it's not always good enough - especially if you're working on interrupt handling code.

Either way, the whole discussion is moot. We either do support Windows or we don't. Most of the developers don't even have Windows machines, so it's very hard for them to do it - even less so do they have Windows programming knowledge. So what we really need is for someone to implement the thread infrastructure and AIO support on Windows and then all is great (until the next big infrastructure feature of course).

If only the Android people didn't simply fork every project out there, but worked upstream, we'd probably have quite a few folks from that crowd happy to support Windows, as they depend on it heavily.


Alex
Stefan Hajnoczi - Feb. 8, 2011, 10:17 a.m.
Introducing IOTHREAD made !CONFIG_IOTHREAD platforms second class
citizens.  I think you'd like people to provide full support when they
introduce new features.

This is a good motivator to use glib and have a unified code path for
TCG/KVM and Linux/Windows.  Yes it will require some work and some
optimization, but at the end we'll have better host platform parity
and a simpler main loop for TCG/KVM to interact with.

Stefan
Jan Kiszka - Feb. 8, 2011, 10:21 a.m.
On 2011-02-08 11:06, Aurelien Jarno wrote:
> Jan Kiszka wrote:
>> On 2011-02-08 10:58, Aurelien Jarno wrote:
>>> Jan Kiszka wrote:
>>>> On 2011-02-08 10:05, Aurelien Jarno wrote:
>>>>> Jan Kiszka wrote:
>>>>>> On 2011-02-08 09:08, Paolo Bonzini wrote:
>>>>>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>>>>>>>> I forget to remember when we decided that AIO should be implemented on
>>>>>>>> any host OS. Any pointer?
>>>>>>> To be fair, I/O-heavy workloads are almost unusable without AIO.  For 
>>>>>>> Window targets, they also crash under SMP due to the Windows AP 
>>>>>>> watchdog.  But then TCG and SMP do not go very well together anyway.
>>>>>>>
>>>>>>> However, I think deprecating Win32 support would be a very bad idea.
>>>>>> It would be too early at this point.
>>>>>>
>>>>>> But if Windows is once the only reason to keep tons of hardly tested
>>>>>> code paths around or to invest significant additional effort to change
>>>>>> logic or interfaces in this area, than I would prefer that step. I'm
>>>>>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
>>>>>> subtle differences are really a PITA and source of various breakages.
>>>>>>
>>>>>> People interested in that platform should finally realize that its fate
>>>>>> is coupled to reducing the #ifdefs as well as the design differences we
>>>>>> see right now and even more in the future.
>>>>>>
>>>>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept,
>>>>> it's just that people who introduce IOTHREAD didn't care about Windows
>>>>> support at all and added these #ifdef. Disabling Windows support because
>>>>> of that is not fair.
>>>> The TCG execution model won't scale long-term. It's already a main to
>>>> boot a quad or just dual core VM, even more when your host has at least
>>>> as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the
>>>> future, and the iothread will just be one of 7, 17 or 257 threads.
>>>>
>>> And what's the issue with that? People don't always look for performance
>>> when using QEMU. They even often try to emulate old machines (and non
>>> x86 ones), which anyway only have one CPU. This won't change in 5 years,
>>> the only thing is that those machines will be 5 years older.
>>>
>>> People have to keep in mind that QEMU doesn't mean only virtualization
>>> and doesn't mean only x86.
>>
>> I'm not talking about virtualization here. I'm talking about usable
>> emulation of today's (!) embedded multi-core platforms. It matters a lot
>> if your test roundtrip for booting into a SMP guest and running some
>> apps is a few 10 seconds, a few minutes or even not practically working.
>> Ever tried to boot a 16 core VM in emulation mode? I did, for fun. I
>> just hope I'll never depend on this for work.
> 
> Yes, it's slow. But is it a problem? You assume that people use QEMU
> only for emulating SMP platforms. This is a wrong assumption. Beside the
> x86 target, only sparc really supports SMP emulation.

That's too nearsighted. SMP will be a commodity on practically _any_ arch
within the next few years. And if QEMU doesn't keep up with it, feature- and
performance-wise, it will lose market share.

Jan
Aurelien Jarno - Feb. 8, 2011, 10:26 a.m.
Jan Kiszka a écrit :
> On 2011-02-08 11:06, Aurelien Jarno wrote:
>> Jan Kiszka a écrit :
>>> On 2011-02-08 10:58, Aurelien Jarno wrote:
>>>> Jan Kiszka a écrit :
>>>>> On 2011-02-08 10:05, Aurelien Jarno wrote:
>>>>>> Jan Kiszka a écrit :
>>>>>>> On 2011-02-08 09:08, Paolo Bonzini wrote:
>>>>>>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>>>>>>>>> I forget to remember when we decided that AIO should be implemented on
>>>>>>>>> any host OS. Any pointer?
>>>>>>>> To be fair, I/O-heavy workloads are almost unusable without AIO.  For 
>>>>>>>> Window targets, they also crash under SMP due to the Windows AP 
>>>>>>>> watchdog.  But then TCG and SMP do not go very well together anyway.
>>>>>>>>
>>>>>>>> However, I think deprecating Win32 support would be a very bad idea.
>>>>>>> It would be too early at this point.
>>>>>>>
>>>>>>> But if Windows is once the only reason to keep tons of hardly tested
>>>>>>> code paths around or to invest significant additional effort to change
>>>>>>> logic or interfaces in this area, than I would prefer that step. I'm
>>>>>>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
>>>>>>> subtle differences are really a PITA and source of various breakages.
>>>>>>>
>>>>>>> People interested in that platform should finally realize that its fate
>>>>>>> is coupled to reducing the #ifdefs as well as the design differences we
>>>>>>> see right now and even more in the future.
>>>>>>>
>>>>>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept,
>>>>>> it's just that people who introduce IOTHREAD didn't care about Windows
>>>>>> support at all and added these #ifdef. Disabling Windows support because
>>>>>> of that is not fair.
>>>>> The TCG execution model won't scale long-term. It's already a main to
>>>>> boot a quad or just dual core VM, even more when your host has at least
>>>>> as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the
>>>>> future, and the iothread will just be one of 7, 17 or 257 threads.
>>>>>
>>>> And what's the issue with that? People don't always look for performance
>>>> when using QEMU. They even often try to emulate old machines (and non
>>>> x86 ones), which anyway only have one CPU. This won't change in 5 years,
>>>> the only thing is that those machines will be 5 years older.
>>>>
>>>> People have to keep in mind that QEMU doesn't mean only virtualization
>>>> and doesn't mean only x86.
>>> I'm not talking about virtualization here. I'm talking about usable
>>> emulation of today's (!) embedded multi-core platforms. It matters a lot
>>> if your test roundtrip for booting into a SMP guest and running some
>>> apps is a few 10 seconds, a few minutes or even not practically working.
>>> Ever tried to boot a 16 core VM in emulation mode? I did, for fun. I
>>> just hope I'll never depend on this for work.
>> Yes, it's slow. But is it a problem? You assume that people use QEMU
>> only for emulating SMP platforms. This is a wrong assumption. Beside the
>> x86 target, only sparc really supports SMP emulation.
> 
> That's too nearsighted. SMP will be commodity on practically _any_ arch
> within the next years. And if QEMU doesn't keep up with it, feature and
> performance-wise, it will loose market share.
> 

Oh, commercial arguments now. I am looking for something that answers my
needs, not for market share.
Aurelien Jarno - Feb. 8, 2011, 10:27 a.m.
Stefan Hajnoczi a écrit :
> Introducing IOTHREAD made !CONFIG_IOTHREAD platforms second class
> citizens.  I think you'd like people to provide full support when they
> introduce new features.
> 

I think you really pointed out the problem here. We should probably add a
feature that will make KVM a second class citizen so that people can
understand what it means.
Jan Kiszka - Feb. 8, 2011, 10:30 a.m.
On 2011-02-08 11:26, Aurelien Jarno wrote:
> Jan Kiszka a écrit :
>> On 2011-02-08 11:06, Aurelien Jarno wrote:
>>> Jan Kiszka a écrit :
>>>> On 2011-02-08 10:58, Aurelien Jarno wrote:
>>>>> Jan Kiszka a écrit :
>>>>>> On 2011-02-08 10:05, Aurelien Jarno wrote:
>>>>>>> Jan Kiszka a écrit :
>>>>>>>> On 2011-02-08 09:08, Paolo Bonzini wrote:
>>>>>>>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>>>>>>>>>> I forget to remember when we decided that AIO should be implemented on
>>>>>>>>>> any host OS. Any pointer?
>>>>>>>>> To be fair, I/O-heavy workloads are almost unusable without AIO.  For 
>>>>>>>>> Window targets, they also crash under SMP due to the Windows AP 
>>>>>>>>> watchdog.  But then TCG and SMP do not go very well together anyway.
>>>>>>>>>
>>>>>>>>> However, I think deprecating Win32 support would be a very bad idea.
>>>>>>>> It would be too early at this point.
>>>>>>>>
>>>>>>>> But if Windows is once the only reason to keep tons of hardly tested
>>>>>>>> code paths around or to invest significant additional effort to change
>>>>>>>> logic or interfaces in this area, than I would prefer that step. I'm
>>>>>>>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
>>>>>>>> subtle differences are really a PITA and source of various breakages.
>>>>>>>>
>>>>>>>> People interested in that platform should finally realize that its fate
>>>>>>>> is coupled to reducing the #ifdefs as well as the design differences we
>>>>>>>> see right now and even more in the future.
>>>>>>>>
>>>>>>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept,
>>>>>>> it's just that people who introduce IOTHREAD didn't care about Windows
>>>>>>> support at all and added these #ifdef. Disabling Windows support because
>>>>>>> of that is not fair.
>>>>>> The TCG execution model won't scale long-term. It's already a main to
>>>>>> boot a quad or just dual core VM, even more when your host has at least
>>>>>> as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the
>>>>>> future, and the iothread will just be one of 7, 17 or 257 threads.
>>>>>>
>>>>> And what's the issue with that? People don't always look for performance
>>>>> when using QEMU. They even often try to emulate old machines (and non
>>>>> x86 ones), which anyway only have one CPU. This won't change in 5 years,
>>>>> the only thing is that those machines will be 5 years older.
>>>>>
>>>>> People have to keep in mind that QEMU doesn't mean only virtualization
>>>>> and doesn't mean only x86.
>>>> I'm not talking about virtualization here. I'm talking about usable
>>>> emulation of today's (!) embedded multi-core platforms. It matters a lot
>>>> if your test roundtrip for booting into a SMP guest and running some
>>>> apps is a few 10 seconds, a few minutes or even not practically working.
>>>> Ever tried to boot a 16 core VM in emulation mode? I did, for fun. I
>>>> just hope I'll never depend on this for work.
>>> Yes, it's slow. But is it a problem? You assume that people use QEMU
>>> only for emulating SMP platforms. This is a wrong assumption. Beside the
>>> x86 target, only sparc really supports SMP emulation.
>>
>> That's too nearsighted. SMP will be commodity on practically _any_ arch
>> within the next years. And if QEMU doesn't keep up with it, feature and
>> performance-wise, it will loose market share.
>>
> 
> Oh commercial arguments now. I am looking for something that answer my
> needs, not about market share.
> 

"Market share" simply means user base, for commercial or for hobby,
academic, whatever use. QEMU has a nice position here ATM. Even
commercial competitors can't help continuously comparing their solutions
with QEMU (I once enjoyed such a product presentation). However, time
does not stand still.

Jan
Paolo Bonzini - Feb. 8, 2011, 10:31 a.m.
On 02/08/2011 11:27 AM, Aurelien Jarno wrote:
> Stefan Hajnoczi a écrit :
>> Introducing IOTHREAD made !CONFIG_IOTHREAD platforms second class
>> citizens.  I think you'd like people to provide full support when they
>> introduce new features.
>
> I think you really pointed the problem here. We should probably add a
> feature that will make KVM second class citizen so that people can
> understand what it means.

I actually don't think introducing IOTHREAD made Windows a second class 
citizen, since it was left as a non-default choice for years.  People 
care about IOTHREAD now only because (after years) there is serious 
thought about making it the default.

I'm sure that if you add such a killer feature that is TCG-only, KVM 
people will try to support it in a shorter timeframe.

Paolo
Jan Kiszka - Feb. 8, 2011, 10:40 a.m.
On 2011-02-08 11:27, Aurelien Jarno wrote:
> Stefan Hajnoczi a écrit :
>> Introducing IOTHREAD made !CONFIG_IOTHREAD platforms second class
>> citizens.  I think you'd like people to provide full support when they
>> introduce new features.
>>
> 
> I think you really pointed the problem here. We should probably add a
> feature that will make KVM second class citizen so that people can
> understand what it means.

There are people out there who have already thought aloud about forking or
rewriting those QEMU bits required for KVM support just to make "life
easier". I already disagreed with this, and I continue to do so, as both
use cases nicely benefit from each other.

KVM is driving QEMU features today that would otherwise have taken years
to show up, if at all. On the other side, all those bits related to the
cross-arch platform emulation of non-x86 help and will continue to help
KVM support on those archs as well (we already have it on PPC, we'll see
it on ARM and likely more in the future).

So, please let's stop this useless finger pointing, on both sides. KVM
and QEMU live in symbiosis. Unfortunately, this is not (yet?) the case for
POSIX vs. Windows hosts.

Jan
Tristan Gingold - Feb. 8, 2011, 11:07 a.m.
On Feb 8, 2011, at 6:58 PM, Anthony Liguori wrote:

> On 02/08/2011 04:06 AM, Aurelien Jarno wrote:
>> Yes, it's slow. But is it a problem? You assume that people use QEMU
>> only for emulating SMP platforms. This is a wrong assumption. Beside the
>> x86 target, only sparc really supports SMP emulation.
>>   
> 
> It's *not* just about performance.
> 
> TCG requires a signal to break out of a tight chained TB loop.  If you have a guest in a tight loop waiting for something external (like polling on a in-memory flag), the device emulation will not get to run until a signal is fired.
> 
> Unless you set SIGIO on every file descriptor that selects polls on (and you can't because there are a number that just don't support SIGIO), then you have a race condition.

A race condition? It looks like you are describing a deadlock.

But the deadlock doesn't happen, because of the timer which periodically exits from TCG.  Hence the performance issue.

> This can be fixed by running TCG in a separate thread than select() and sending a signal to the TCG VCPU when select() returns (effectively SIGIO in userspace).
> 
> This is exactly what the I/O thread does.


(Was nobody able to make it work on Windows, or was nobody interested in it?)

Tristan.
Aurelien Jarno - Feb. 8, 2011, 11:15 a.m.
Anthony Liguori a écrit :
> On 02/08/2011 04:06 AM, Aurelien Jarno wrote:
>> Yes, it's slow. But is it a problem? You assume that people use QEMU
>> only for emulating SMP platforms. This is a wrong assumption. Beside the
>> x86 target, only sparc really supports SMP emulation.
>>    
> 
> It's *not* just about performance.
> 
> TCG requires a signal to break out of a tight chained TB loop.  If you 
> have a guest in a tight loop waiting for something external (like 
> polling on a in-memory flag), the device emulation will not get to run 
> until a signal is fired.
> 
> Unless you set SIGIO on every file descriptor that selects polls on (and 
> you can't because there are a number that just don't support SIGIO), 
> then you have a race condition.
> 

In practice you will get a signal when the next timer event expires. I
agree it's suboptimal, but it works, and has been like that for years.

Having that fixed through an I/O thread is actually quite nice, however
it should not be done ignoring all the *current* drawbacks of the
iothread mode. We know them (at least for some of them), so let's try to
solve them.

And now, I don't buy the argument "it's been there for years", it was
*disabled* by default.
Aurelien Jarno - Feb. 8, 2011, 11:29 a.m.
Anthony Liguori a écrit :
> On 02/08/2011 04:27 AM, Aurelien Jarno wrote:
>> Stefan Hajnoczi a écrit :
>>    
>>> Introducing IOTHREAD made !CONFIG_IOTHREAD platforms second class
>>> citizens.  I think you'd like people to provide full support when they
>>> introduce new features.
>>>
>>>      
>> I think you really pointed the problem here. We should probably add a
>> feature that will make KVM second class citizen so that people can
>> understand what it means.
>>    
> 
> Aurelien,
> 
> Have you actually run QEMU on Windows and tried to use it to do 
> something useful?
> 
> As an exercise, walk through the various releases of QEMU and compare 
> how well it works on Windows to any Unix platform.  Windows support in 
> QEMU has always been a second class citizen.

I never tried to get it working on Windows, but I know some people using
it there. We just shouldn't ignore them. Maybe it's not perfect, but
it is enough for those people.

> If someone is willing to stand up and properly maintain it, I'm all for 
> doing whatever we can to be supportive of that person but as of right 
> now, that doesn't exist.

There are regular patches for Windows support, and Stefan Weil is producing
builds regularly. Maybe it doesn't have all the features, but people are
making sure it basically works.

Now you want to break that because the *new* feature you want to
introduce is not supported on Windows. I insist on the fact that it is a new
feature simply because it was *disabled* by default. So I don't buy the
argument that "that person doesn't exist".

Send a call for help on that subject, give people some time to come up
with a solution (let's put a deadline), and then if nobody appears we
can definitely consider Windows support as dead. But it should not be
done arbitrarily.
Aurelien Jarno - Feb. 8, 2011, 11:46 a.m.
On Tue, Feb 08, 2011 at 12:07:02PM +0100, Tristan Gingold wrote:
> 
> On Feb 8, 2011, at 6:58 PM, Anthony Liguori wrote:
> 
> > On 02/08/2011 04:06 AM, Aurelien Jarno wrote:
> >> Yes, it's slow. But is it a problem? You assume that people use QEMU
> >> only for emulating SMP platforms. This is a wrong assumption. Beside the
> >> x86 target, only sparc really supports SMP emulation.
> >>   
> > 
> > It's *not* just about performance.
> > 
> > TCG requires a signal to break out of a tight chained TB loop.  If you have a guest in a tight loop waiting for something external (like polling on a in-memory flag), the device emulation will not get to run until a signal is fired.
> > 
> > Unless you set SIGIO on every file descriptor that selects polls on (and you can't because there are a number that just don't support SIGIO), then you have a race condition.
> 
> A race condition ?  Looks like you are describing a dead-lock.
> 
> But the dead lock doesn't happen because of the timer which periodically exits from TCG.  Hence the performance issue.
> 
> > This can be fixed by running TCG in a separate thread than select() and sending a signal to the TCG VCPU when select() returns (effectively SIGIO in userspace).
> > 
> > This is exactly what the I/O thread does.
> 
> 
> (Nobody was able to make it working on Windows - or nobody was interested in ?)
> 
Given the I/O thread is disabled by default, my guess is that nobody 
really sees an interest in looking at that.
Paolo Bonzini - Feb. 8, 2011, 12:07 p.m.
On 02/08/2011 12:46 PM, Aurelien Jarno wrote:
> Given the I/O thread is disabled by default, my guess is that nobody
> really see an interest in looking at that.

I had started looking at it in my free time.  I stopped because the 
thread pool series were continuously changing the QemuThread APIs.  I 
can resume looking at it.

Paolo
Paolo Bonzini - Feb. 8, 2011, 12:10 p.m.
On 02/08/2011 12:15 PM, Aurelien Jarno wrote:
> however
> it should not be done ignoring all the *current* drawbacks of the
> iothread mode. We know them (at least for some of them), so let's try to
> solve them.

Let's also enumerate them.

> And now, I don't buy the argument "it's been there for years", it was
> *disabled* by default.

It was disabled by default only because it is most useful for KVM and 
people were using qemu-kvm's iothread.

Paolo
Riku Voipio - Feb. 8, 2011, 12:38 p.m.
On Tue, Feb 08, 2011 at 12:05:31PM -0600, Anthony Liguori wrote:
> Aurelien,

> Have you actually run QEMU on Windows and tried to use it to do  
> something useful?

I'm not Aurelien, but we do use QEMU on win32 as part of Nokia Qt SDK.
While it is second class in many ways compared to Linux QEMU (or even
OS X), it is still quite useful for us.

Riku
Aurelien Jarno - Feb. 8, 2011, 1:30 p.m.
Anthony Liguori a écrit :
> On 02/08/2011 05:15 AM, Aurelien Jarno wrote:
>> Anthony Liguori a écrit :
>>    
>>> On 02/08/2011 04:06 AM, Aurelien Jarno wrote:
>>>      
>>>> Yes, it's slow. But is it a problem? You assume that people use QEMU
>>>> only for emulating SMP platforms. This is a wrong assumption. Beside the
>>>> x86 target, only sparc really supports SMP emulation.
>>>>
>>>>        
>>> It's *not* just about performance.
>>>
>>> TCG requires a signal to break out of a tight chained TB loop.  If you
>>> have a guest in a tight loop waiting for something external (like
>>> polling on a in-memory flag), the device emulation will not get to run
>>> until a signal is fired.
>>>
>>> Unless you set SIGIO on every file descriptor that selects polls on (and
>>> you can't because there are a number that just don't support SIGIO),
>>> then you have a race condition.
>>>
>>>      
>> In practice you will get a signal when the next timer event expire. I
>> agree it's suboptimal, but it works, and has been like that for here.
>>    
> 
> During early boot up before the periodic timer is enabled can cause 
> quite a noticable issue here.
> 
> I think it's cris specifically that does polling I/O in the early 
> startup before any periodic timer is enabled.
> 
>> Having that fixed through an I/O thread is actually quite nice, however
>> it should not be done ignoring all the *current* drawbacks of the
>> iothread mode. We know them (at least for some of them), so let's try to
>> solve them.
>>    
> 
> Yes, agree 100%.
> 
>> And now, I don't buy the argument "it's been there for years", it was
>> *disabled* by default.
>>    
> 
> Yeah, I think we need to enable it by default and commit to fixing all 
> of the outstanding issues.

So the strategy is to break everything and wait for the maintainer to
fix it? This strategy doesn't work; we have seen that, for example, with
the SeaBIOS switch. While it brings nice features, it has broken the
isapc machine. And it's still not fixed...

This strategy also doesn't scale, as the maintainers then spend
their time fixing bugs introduced because others didn't care. Resources
are not unlimited, especially for those doing that in their free time.

> I think we've fixed all that we're aware of but we probably won't find 
> the rest unless we enable it universally.

I agree that we are going to discover bugs, and it's normal. QEMU is
quite complex and it's not possible to test every combination. That said
we are already aware of some bugs, why not fix them, or at least try to
fix them? For example, we haven't fixed the performance regression with
TCG (at least it wasn't fixed two weeks ago).
Aurelien Jarno - Feb. 8, 2011, 1:31 p.m.
Paolo Bonzini a écrit :
> On 02/08/2011 12:15 PM, Aurelien Jarno wrote:
>> however
>> it should not be done ignoring all the *current* drawbacks of the
>> iothread mode. We know them (at least for some of them), so let's try to
>> solve them.
> 
> Let's also enumerate them.
> 

From what I know:
- performance regression in TCG mode
- Windows support

I am going to look again at the first one tonight to provide some numbers.
Aurelien Jarno - Feb. 8, 2011, 3:08 p.m.
Aurelien Jarno a écrit :
> Paolo Bonzini a écrit :
>> On 02/08/2011 12:15 PM, Aurelien Jarno wrote:
>>> however
>>> it should not be done ignoring all the *current* drawbacks of the
>>> iothread mode. We know them (at least for some of them), so let's try to
>>> solve them.
>> Let's also enumerate them.
>>
> 
> From what I know:
> - performance regression in TCG mode

I set up an x86_64 guest on an x86_64 host (Intel Xeon E5345). Nothing
was running except the standard daemons, and the CPU governor was set to
"performance" on all CPUs. I then compared the network performance using
netperf in default mode, through a tap interface and a virtio NIC. I got
the following results (quite reproducible, standard deviation below 0.5):
- without IO thread: 107.36 MB/s
- with IO thread:     89.93 MB/s

I haven't redone the tests I did two weeks ago on MIPS, ARM,
PowerPC and SH4 (using different emulated network cards: smc91c111,
rtl8139, e1000, virtio), but the slowdown was roughly the same, except
on ARM where it was larger.
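
For scale, the two netperf figures above work out to a relative throughput loss of roughly 16% with the I/O thread enabled. A minimal sketch of that arithmetic follows; the helper name `slowdown_percent` is hypothetical and used only for illustration.

```c
/* Relative throughput loss, in percent, of a run `with` the I/O thread
 * compared to a run `without` it. Hypothetical helper, not QEMU code. */
double slowdown_percent(double without, double with)
{
    return (without - with) / without * 100.0;
}
```

Called as `slowdown_percent(107.36, 89.93)`, this gives about 16.2, which sits between the "5 - 10%" and "20%" figures quoted earlier in the thread.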
Aurelien Jarno - Feb. 8, 2011, 3:09 p.m.
Anthony Liguori a écrit :
> On 02/08/2011 07:30 AM, Aurelien Jarno wrote:
>> So the strategy is let's break everything and wait for the maintainer to
>> fix that? This strategy doesn't work, we have seen for example that with
>> the SeaBIOS switch. While it brings nice features, it has broken the
>> isapc machine. And it's still not fixed...
>>    
> 
> The fundamental problem is that poorly thought out features have been 
> committed in the past.  isapc is a good example of this.
> 
> You can't just remove a chipset but leave an ISA bus implementation and 
> expect things to just keep working.  Even the early ISA-only systems had 
> a chipset that firmware interfaced with.
> 
>> Also this strategy doesn't scale, then the maintainers are spending
>> their time fixing bugs introduced because others didn't care. Resources
>> are not unlimited, especially for those doing that on their free time.
>>    
> 
> So are you suggesting that every half baked feature should hold up any 
> other future developments?  I think the real problem is exactly the 
> opposite of what you describe.  Why should we waste finite resources 
> keeping something like Windows support limping along?
> 
> We need to do a better job of not adding features that there is no 
> serious intention of every supporting in a meaningful way.  I think the 
> recent discussion of w64 is a good example of this.  I can't imagine 
> trying to support w64 in QEMU until someone actually makes w32 work in a 
> reasonable way.

Yes, we should at least leave people time to find a solution. If nobody
comes up with a solution, let's consider it deprecated.

>>> I think we've fixed all that we're aware of but we probably won't find
>>> the rest unless we enable it universally.
>>>      
>> I agree that we are going to discover bugs, and it's normal. QEMU is
>> quite complex and it's not possible to test every combination. That said
>> we are already aware of some bugs, why not fix them, or at least try to
>> fix them? For example we haven't fixed the performance regression with
>> TCG (at least it wasn't the case two weeks ago).
>>    
> 
> If there are known issues, yes, let's fix them before enabling it.
> 

So please look at this TCG performance regression instead of talking
about enabling this just after the release. I don't consider TCG a half
baked feature; for people who forgot, it's the original QEMU mode.
Anthony Liguori - Feb. 8, 2011, 5:58 p.m.
On 02/08/2011 04:06 AM, Aurelien Jarno wrote:
> Yes, it's slow. But is it a problem? You assume that people use QEMU
> only for emulating SMP platforms. This is a wrong assumption. Beside the
> x86 target, only sparc really supports SMP emulation.
>    

It's *not* just about performance.

TCG requires a signal to break out of a tight chained TB loop.  If you 
have a guest in a tight loop waiting for something external (like 
polling on an in-memory flag), the device emulation will not get to run 
until a signal is fired.

Unless you set SIGIO on every file descriptor that select() polls on (and 
you can't, because a number of them just don't support SIGIO), 
then you have a race condition.

This can be fixed by running TCG in a separate thread from select() and 
sending a signal to the TCG VCPU when select() returns (effectively 
SIGIO in userspace).

This is exactly what the I/O thread does.

Regards,

Anthony Liguori
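
The mechanism described in the message above can be sketched in a few lines of POSIX C. This is only an illustration under stated assumptions, not QEMU's implementation: a busy loop stands in for the chained TB loop, a pipe stands in for a device file descriptor, SIGUSR1 stands in for the kick signal, and every identifier (`exit_request`, `vcpu_fn`, `io_fn`, `run_kick_demo`) is hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* All names below are hypothetical stand-ins, not QEMU identifiers. */
static volatile sig_atomic_t exit_request; /* set from the signal handler */
static pthread_t vcpu_thread;
static int event_pipe[2];                  /* stands in for a device fd */

static void kick_handler(int sig)
{
    (void)sig;
    exit_request = 1;
}

/* Stand-in for a guest spinning in chained translated code. */
static void *vcpu_fn(void *arg)
{
    (void)arg;
    while (!exit_request) {
        /* guest polls; nothing in this loop ever looks at event_pipe */
    }
    return NULL;
}

/* I/O thread: sleep in select(), then signal the vCPU thread. */
static void *io_fn(void *arg)
{
    fd_set rfds;

    (void)arg;
    FD_ZERO(&rfds);
    FD_SET(event_pipe[0], &rfds);
    select(event_pipe[0] + 1, &rfds, NULL, NULL, NULL);
    pthread_kill(vcpu_thread, SIGUSR1);    /* "SIGIO in userspace" */
    return NULL;
}

/* Returns 1 once the vCPU loop has been interrupted, 0 on failure. */
int run_kick_demo(void)
{
    struct sigaction sa;
    pthread_t io_thread;
    char byte = 1;
    int ok;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = kick_handler;
    sigemptyset(&sa.sa_mask);
    exit_request = 0;
    if (pipe(event_pipe) != 0 || sigaction(SIGUSR1, &sa, NULL) != 0) {
        return 0;
    }
    if (pthread_create(&vcpu_thread, NULL, vcpu_fn, NULL) != 0) {
        return 0;
    }
    if (pthread_create(&io_thread, NULL, io_fn, NULL) != 0) {
        pthread_kill(vcpu_thread, SIGUSR1); /* stop the spinning thread */
        pthread_join(vcpu_thread, NULL);
        return 0;
    }
    /* Simulate device activity: the fd becomes readable. */
    ok = (write(event_pipe[1], &byte, 1) == 1);
    close(event_pipe[1]);  /* EOF also wakes select() if the write failed */
    pthread_join(io_thread, NULL);
    pthread_join(vcpu_thread, NULL);
    close(event_pipe[0]);
    return ok && exit_request;
}
```

Because the thread blocked in select() is not the thread executing guest code, the kick is sent as soon as the descriptor becomes readable, without relying on SIGIO support for that descriptor or on a guest timer firing.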
Anthony Liguori - Feb. 8, 2011, 6:05 p.m.
On 02/08/2011 04:27 AM, Aurelien Jarno wrote:
> Stefan Hajnoczi a écrit :
>    
>> Introducing IOTHREAD made !CONFIG_IOTHREAD platforms second class
>> citizens.  I think you'd like people to provide full support when they
>> introduce new features.
>>
>>      
> I think you really pointed the problem here. We should probably add a
> feature that will make KVM second class citizen so that people can
> understand what it means.
>    

Aurelien,

Have you actually run QEMU on Windows and tried to use it to do 
something useful?

As an exercise, walk through the various releases of QEMU and compare 
how well it works on Windows to any Unix platform.  Windows support in 
QEMU has always been a second class citizen.

If someone is willing to stand up and properly maintain it, I'm all for 
doing whatever we can to be supportive of that person but as of right 
now, that doesn't exist.

Regards,

Anthony Liguori
Anthony Liguori - Feb. 8, 2011, 7:17 p.m.
On 02/08/2011 05:15 AM, Aurelien Jarno wrote:
> Anthony Liguori a écrit :
>    
>> On 02/08/2011 04:06 AM, Aurelien Jarno wrote:
>>      
>>> Yes, it's slow. But is it a problem? You assume that people use QEMU
>>> only for emulating SMP platforms. This is a wrong assumption. Beside the
>>> x86 target, only sparc really supports SMP emulation.
>>>
>>>        
>> It's *not* just about performance.
>>
>> TCG requires a signal to break out of a tight chained TB loop.  If you
>> have a guest in a tight loop waiting for something external (like
>> polling on a in-memory flag), the device emulation will not get to run
>> until a signal is fired.
>>
>> Unless you set SIGIO on every file descriptor that selects polls on (and
>> you can't because there are a number that just don't support SIGIO),
>> then you have a race condition.
>>
>>      
> In practice you will get a signal when the next timer event expire. I
> agree it's suboptimal, but it works, and has been like that for here.
>    

Early boot, before the periodic timer is enabled, can show quite a 
noticeable issue here.

I think it's cris specifically that does polling I/O in the early 
startup before any periodic timer is enabled.

> Having that fixed through an I/O thread is actually quite nice, however
> it should not be done ignoring all the *current* drawbacks of the
> iothread mode. We know them (at least for some of them), so let's try to
> solve them.
>    

Yes, agree 100%.

> And now, I don't buy the argument "it's been there for years", it was
> *disabled* by default.
>    

Yeah, I think we need to enable it by default and commit to fixing all 
of the outstanding issues.

I think we've fixed all that we're aware of but we probably won't find 
the rest unless we enable it universally.

Regards,

Anthony Liguori
Anthony Liguori - Feb. 8, 2011, 7:21 p.m.
On 02/08/2011 05:46 AM, Aurelien Jarno wrote:
> On Tue, Feb 08, 2011 at 12:07:02PM +0100, Tristan Gingold wrote:
>    
>> On Feb 8, 2011, at 6:58 PM, Anthony Liguori wrote:
>>
>>      
>>> On 02/08/2011 04:06 AM, Aurelien Jarno wrote:
>>>        
>>>> Yes, it's slow. But is it a problem? You assume that people use QEMU
>>>> only for emulating SMP platforms. This is a wrong assumption. Beside the
>>>> x86 target, only sparc really supports SMP emulation.
>>>>
>>>>          
>>> It's *not* just about performance.
>>>
>>> TCG requires a signal to break out of a tight chained TB loop.  If you have a guest in a tight loop waiting for something external (like polling on a in-memory flag), the device emulation will not get to run until a signal is fired.
>>>
>>> Unless you set SIGIO on every file descriptor that selects polls on (and you can't because there are a number that just don't support SIGIO), then you have a race condition.
>>>        
>> A race condition ?  Looks like you are describing a dead-lock.
>>
>> But the dead lock doesn't happen because of the timer which periodically exits from TCG.  Hence the performance issue.
>>      

With dynticks, you don't always have a periodic timer (unless the guest 
has a periodic timer enabled).  There's a good bit of early startup code 
that runs without a periodic timer enabled.

Now that said, we never truly sleep forever.  We'll set something like a 
5 second timeout.  But 5 seconds might as well be forever and this is 
certainly a giant hack.

Regards,

Anthony Liguori

>>> This can be fixed by running TCG in a separate thread from select() and sending a signal to the TCG VCPU when select() returns (effectively SIGIO in userspace).
>>>
>>> This is exactly what the I/O thread does.
>>>        
>>
>> (Nobody was able to make it work on Windows, or was nobody interested?)
>>
>>      
> Given the I/O thread is disabled by default, my guess is that nobody
> really sees an interest in looking at that.
>
>
Anthony Liguori - Feb. 8, 2011, 8:54 p.m.
On 02/08/2011 07:30 AM, Aurelien Jarno wrote:
> So the strategy is let's break everything and wait for the maintainer to
> fix that? This strategy doesn't work, we have seen for example that with
> the SeaBIOS switch. While it brings nice features, it has broken the
> isapc machine. And it's still not fixed...
>    

The fundamental problem is that poorly thought out features have been 
committed in the past.  isapc is a good example of this.

You can't just remove a chipset but leave an ISA bus implementation and 
expect things to just keep working.  Even the early ISA-only systems had 
a chipset that firmware interfaced with.

> Also this strategy doesn't scale, then the maintainers are spending
> their time fixing bugs introduced because others didn't care. Resources
> are not unlimited, especially for those doing that on their free time.
>    

So are you suggesting that every half-baked feature should hold up any
other future developments?  I think the real problem is exactly the 
opposite of what you describe.  Why should we waste finite resources 
keeping something like Windows support limping along?

We need to do a better job of not adding features that there is no 
serious intention of ever supporting in a meaningful way.  I think the
recent discussion of w64 is a good example of this.  I can't imagine 
trying to support w64 in QEMU until someone actually makes w32 work in a 
reasonable way.

>> I think we've fixed all that we're aware of but we probably won't find
>> the rest unless we enable it universally.
>>      
> I agree that we are going to discover bugs, and it's normal. QEMU is
> quite complex and it's not possible to test every combination. That said
> we are already aware of some bugs, why not fix them, or at least try to
> fix them? For example we haven't fixed the performance regression with
> TCG (at least it wasn't the case two weeks ago).
>    

If there are known issues, yes, let's fix them before enabling it.

Regards,

Anthony Liguori
Blue Swirl - Feb. 9, 2011, 5:13 p.m.
On Tue, Feb 8, 2011 at 5:09 PM, Aurelien Jarno <aurelien@aurel32.net> wrote:
> Anthony Liguori wrote:
>> On 02/08/2011 07:30 AM, Aurelien Jarno wrote:
>>> So the strategy is let's break everything and wait for the maintainer to
>>> fix that? This strategy doesn't work, we have seen for example that with
>>> the SeaBIOS switch. While it brings nice features, it has broken the
>>> isapc machine. And it's still not fixed...
>>>
>>
>> The fundamental problem is that poorly thought out features have been
>> committed in the past.  isapc is a good example of this.
>>
>> You can't just remove a chipset but leave an ISA bus implementation and
>> expect things to just keep working.  Even the early ISA-only systems had
>> a chipset that firmware interfaced with.
>>
>>> Also this strategy doesn't scale, then the maintainers are spending
>>> their time fixing bugs introduced because others didn't care. Resources
>>> are not unlimited, especially for those doing that on their free time.
>>>
>>
>> So are you suggesting that every half-baked feature should hold up any
>> other future developments?  I think the real problem is exactly the
>> opposite of what you describe.  Why should we waste finite resources
>> keeping something like Windows support limping along?
>>
>> We need to do a better job of not adding features that there is no
>> serious intention of ever supporting in a meaningful way.  I think the
>> recent discussion of w64 is a good example of this.  I can't imagine
>> trying to support w64 in QEMU until someone actually makes w32 work in a
>> reasonable way.
>
> Yes, we should at least leave people time to find a solution. If nobody
> comes with a solution, let's consider it deprecated.

I think the win32 situation is somewhat similar to (but not nearly as bad as)
kqemu's. It was useful for some users, but there was no
maintenance, and when it got in the way, it was removed because nobody
could fix it.

But I'd prefer a solution where somebody steps up as Windows
maintainer. I'm also doing regular mingw32 builds but otherwise not
much.
Aurelien Jarno - Feb. 9, 2011, 5:35 p.m.
On Tue, Feb 08, 2011 at 04:08:28PM +0100, Aurelien Jarno wrote:
> Aurelien Jarno wrote:
> > Paolo Bonzini wrote:
> >> On 02/08/2011 12:15 PM, Aurelien Jarno wrote:
> >>> however
> >>> it should not be done ignoring all the *current* drawbacks of the
> >>> iothread mode. We know them (at least for some of them), so let's try to
> >>> solve them.
> >> Let's also enumerate them.
> >>
> > 
> > From what I know:
> > - performance regression in TCG mode
> 
> I set up an x86_64 guest on an x86_64 host (Intel Xeon E5345). Nothing
> was running except the standard daemons and the CPU governor was set to
> "performance" on all CPUs. I then compared the network performance using
> netperf in default mode, through a tap interface and a virtio nic. I got
> the following results (quite reproducible, std below 0.5):
> - without IO thread: 107.36 MB/s
> - with IO thread:     89.93 MB/s
> 

And the same test on the code from September 2009:
- without IO thread: 141.8 MB/s
Anthony Liguori - Feb. 9, 2011, 8:07 p.m.
On 02/09/2011 06:35 PM, Aurelien Jarno wrote:
> On Tue, Feb 08, 2011 at 04:08:28PM +0100, Aurelien Jarno wrote:
>    
>> Aurelien Jarno wrote:
>>      
>>> Paolo Bonzini wrote:
>>>        
>>>> On 02/08/2011 12:15 PM, Aurelien Jarno wrote:
>>>>          
>>>>> however
>>>>> it should not be done ignoring all the *current* drawbacks of the
>>>>> iothread mode. We know them (at least for some of them), so let's try to
>>>>> solve them.
>>>>>            
>>>> Let's also enumerate them.
>>>>
>>>>          
>>>  From what I know:
>>> - performance regression in TCG mode
>>>        
>> I set up an x86_64 guest on an x86_64 host (Intel Xeon E5345). Nothing
>> was running except the standard daemons and the CPU governor was set to
>> "performance" on all CPUs. I then compared the network performance using
>> netperf in default mode, through a tap interface and a virtio nic. I got
>> the following results (quite reproducible, std below 0.5):
>> - without IO thread: 107.36 MB/s
>> - with IO thread:     89.93 MB/s
>>
>>      
> And the same test on the code from September 2009:
> - without IO thread: 141.8 MB/s
>    

virtio-net is super finicky regarding mitigation strategies and their 
relationship to the I/O thread.  Different benchmarks will behave 
differently.  virtio-blk is probably a better device to test as you'll 
get much more consistent results across different types of I/O patterns.

Regards,

Anthony Liguori
Stefan Weil - Feb. 9, 2011, 10:16 p.m.
On 09.02.2011 18:13, Blue Swirl wrote:
> I think the win32 situation is somewhat similar to (but not nearly as bad as)
> kqemu's. It was useful for some users, but there was no
> maintenance, and when it got in the way, it was removed because nobody
> could fix it.
>
> But I'd prefer a solution where somebody steps up as Windows
> maintainer. I'm also doing regular mingw32 builds but otherwise not
> much.
>

VNC threads can be compiled for W32, too.
A short test of the resulting executable was successful, no problems.

The patch is available here:
http://repo.or.cz/w/qemu/ar7.git/commitdiff/aabf11dc0a938b84d76d7c147cbf0445d7bee297

I decided to create a new directory structure hosts/w32, so files can
be moved from the root to hosts/posix, hosts/w32, or hosts/xxx.
Include chains reduce code modifications and conditional compilations.
And people who don't want to see w32 support can remove it easily :-)

Supporting I/O threads for W32 will be possible, too.

I don't think that W32 support is a big problem. It never was.
Some of the real problems were already named in the previous mails.

Regards,
Stefan Weil
Paolo Bonzini - Feb. 10, 2011, 7:34 a.m.
On 02/09/2011 11:16 PM, Stefan Weil wrote:
>
> I decided to create a new directory structure hosts/w32, so files can
> be moved from the root to hosts/posix, hosts/w32, or hosts/xxx.
> Include chains reduce code modifications and conditional compilations.
> And people who don't want to see w32 support can remove it easily :-)
>
> Supporting I/O threads for W32 will be possible, too.

I have patches for Win32 iothread, I'm just posting the series split 
into multiple pieces.

Paolo
Paolo Bonzini - Feb. 10, 2011, 9:54 a.m.
On 02/09/2011 11:16 PM, Stefan Weil wrote:
> The patch is available here:
> http://repo.or.cz/w/qemu/ar7.git/commitdiff/aabf11dc0a938b84d76d7c147cbf0445d7bee297

> diff --git a/hosts/w32/include/signal.h b/hosts/w32/include/signal.h
> new file mode 100644
> index 0000000..e45f03c
> --- /dev/null
> +++ b/hosts/w32/include/signal.h
> @@ -0,0 +1,20 @@
> +/*
> + * QEMU w32 support
> + *
> + * Copyright (C) 2011 Stefan Weil
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef WIN32_SIGNAL_H
> +#define WIN32_SIGNAL_H
> +
> +#include_next <signal.h>
> +#include <sys/types.h>    /* sigset_t */
> +
> +int pthread_sigmask(int how, const sigset_t *set, sigset_t *oldset);
> +int sigfillset(sigset_t *set);
> +
> +#endif /* WIN32_SIGNAL_H */
> diff --git a/hosts/w32/include/time.h b/hosts/w32/include/time.h
> new file mode 100644
> index 0000000..0b997d3
> --- /dev/null
> +++ b/hosts/w32/include/time.h
> @@ -0,0 +1,31 @@
> +/*
> + * QEMU w32 support
> + *
> + * Copyright (C) 2011 Stefan Weil
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#if !defined(W32_TIME_H)
> +#define W32_TIME_H
> +
> +#include_next <time.h>
> +
> +#ifndef HAVE_STRUCT_TIMESPEC
> +#define HAVE_STRUCT_TIMESPEC 1
> +struct timespec {
> +        long tv_sec;
> +        long tv_nsec;
> +};
> +#endif /* HAVE_STRUCT_TIMESPEC */
> +
> +typedef enum {
> +  CLOCK_REALTIME = 0
> +} clockid_t;
> +
> +int clock_getres (clockid_t clock_id, struct timespec *res);
> +int clock_gettime(clockid_t clock_id, struct timespec *pTimespec);
> +
> +#endif /* W32_TIME_H */
> diff --git a/os-win32.c b/os-win32.c
> index b214e6a..7778366 100644
> --- a/os-win32.c
> +++ b/os-win32.c
> @@ -36,6 +36,45 @@
>  /***********************************************************/
>  /* Functions missing in mingw */
>
> +#if defined(CONFIG_THREAD)
> +
> +int clock_gettime(clockid_t clock_id, struct timespec *pTimespec)
> +{
> +  int result = 0;
> +  if (clock_id == CLOCK_REALTIME && pTimespec != 0) {
> +    DWORD t = GetTickCount();
> +    const unsigned cps = 1000;
> +    struct timespec ts;
> +    ts.tv_sec  = t / cps;
> +    ts.tv_nsec = (t % cps) * (1000000000UL / cps);
> +    *pTimespec = ts;
> +  } else {
> +    errno = EINVAL;
> +    result = -1;
> +  }
> +  return result;
> +}

Why is this needed?  The only user of clock_gettime in the POSIX case is 
using CLOCK_MONOTONIC, and actually has a Win32 version already.
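
As a side note on the arithmetic in the quoted hunk: GetTickCount() returns
milliseconds since system start, so the value is split into whole seconds and
a nanosecond remainder (1000000000UL / cps works out to 1000000 ns per tick).
It is also not a wall-clock time. A portable sketch of just that conversion
follows; ticks_ms_to_timespec is a hypothetical name, not something from the
patch.

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: split a millisecond tick count into seconds and
 * nanoseconds, mirroring the arithmetic of the quoted clock_gettime().
 */
struct timespec ticks_ms_to_timespec(uint32_t ticks_ms)
{
    struct timespec ts;

    ts.tv_sec  = ticks_ms / 1000;                    /* whole seconds */
    ts.tv_nsec = (long)(ticks_ms % 1000) * 1000000L; /* leftover ms as ns */
    return ts;
}
```

Because the input is a 32-bit millisecond counter, the result wraps after
roughly 49.7 days of uptime, which is one more reason to treat it as a
monotonic-style tick rather than as CLOCK_REALTIME.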

> +int pthread_sigmask(int how, const sigset_t *set, sigset_t *oldset)
> +{
> +    /* Dummy, do nothing. */
> +    return EINVAL;
> +}
> +
> +int sigfillset(sigset_t *set)
> +{
> +    int result = 0;
> +    if (set) {
> +        *(set) = (sigset_t)(-1);
> +    } else {
> +        errno = EINVAL;
> +        result = -1;
> +    }
> +    return result;
> +}

Instead of these, it's better to provide a Win32 implementation of 
mutexes and condvars.  I'll submit it next week hopefully.

Paolo
Stefan Weil - Feb. 10, 2011, 7:46 p.m.
On 10.02.2011 10:54, Paolo Bonzini wrote:
> On 02/09/2011 11:16 PM, Stefan Weil wrote:
>> The patch is available here:
>> http://repo.or.cz/w/qemu/ar7.git/commitdiff/aabf11dc0a938b84d76d7c147cbf0445d7bee297 
>>
[snip]
>> diff --git a/os-win32.c b/os-win32.c
>> index b214e6a..7778366 100644
>> --- a/os-win32.c
>> +++ b/os-win32.c
>> @@ -36,6 +36,45 @@
>>  /***********************************************************/
>>  /* Functions missing in mingw */
>>
>> +#if defined(CONFIG_THREAD)
>> +
>> +int clock_gettime(clockid_t clock_id, struct timespec *pTimespec)
>> +{
>> +  int result = 0;
>> +  if (clock_id == CLOCK_REALTIME && pTimespec != 0) {
>> +    DWORD t = GetTickCount();
>> +    const unsigned cps = 1000;
>> +    struct timespec ts;
>> +    ts.tv_sec  = t / cps;
>> +    ts.tv_nsec = (t % cps) * (1000000000UL / cps);
>> +    *pTimespec = ts;
>> +  } else {
>> +    errno = EINVAL;
>> +    result = -1;
>> +  }
>> +  return result;
>> +}
>
> Why is this needed?  The only user of clock_gettime in the POSIX case 
> is using CLOCK_MONOTONIC, and actually has a Win32 version already.


qemu-thread.c uses clock_gettime(CLOCK_REALTIME, ...)
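
For context, pthread_cond_timedwait() takes an absolute deadline, by default
measured against CLOCK_REALTIME, so a caller with a relative timeout has to
read the current time and add the timeout to it. A minimal sketch of that
step is below. deadline_after_ms is a hypothetical helper written for this
mail; it is not copied from qemu-thread.c.

```c
#include <time.h>

/*
 * Hypothetical helper: absolute deadline = now + timeout_ms, normalized so
 * that tv_nsec stays below one second.
 */
struct timespec deadline_after_ms(struct timespec now, unsigned timeout_ms)
{
    struct timespec ts = now;

    ts.tv_sec  += timeout_ms / 1000;
    ts.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {   /* carry into the seconds field */
        ts.tv_sec  += 1;
        ts.tv_nsec -= 1000000000L;
    }
    return ts;
}
```

A caller would fill `now` with clock_gettime(CLOCK_REALTIME, &now) and pass
the result to pthread_cond_timedwait().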


>
>> +int pthread_sigmask(int how, const sigset_t *set, sigset_t *oldset)
>> +{
>> +    /* Dummy, do nothing. */
>> +    return EINVAL;
>> +}
>> +
>> +int sigfillset(sigset_t *set)
>> +{
>> +    int result = 0;
>> +    if (set) {
>> +        *(set) = (sigset_t)(-1);
>> +    } else {
>> +        errno = EINVAL;
>> +        result = -1;
>> +    }
>> +    return result;
>> +}
>
> Instead of these, it's better to provide a Win32 implementation of 
> mutexes and condvars.  I'll submit it next week hopefully.
>
> Paolo


That's good news. My patch was only a quick hack to make threaded VNC work.

Thanks,
Stefan
Marcelo Tosatti - Feb. 11, 2011, 12:03 a.m.
On Wed, Feb 09, 2011 at 09:07:52PM +0100, Anthony Liguori wrote:
> On 02/09/2011 06:35 PM, Aurelien Jarno wrote:
> >On Tue, Feb 08, 2011 at 04:08:28PM +0100, Aurelien Jarno wrote:
> >>Aurelien Jarno wrote:
> >>>Paolo Bonzini wrote:
> >>>>On 02/08/2011 12:15 PM, Aurelien Jarno wrote:
> >>>>>however
> >>>>>it should not be done ignoring all the *current* drawbacks of the
> >>>>>iothread mode. We know them (at least for some of them), so let's try to
> >>>>>solve them.
> >>>>Let's also enumerate them.
> >>>>
> >>> From what I know:
> >>>- performance regression in TCG mode
> >>I set up an x86_64 guest on an x86_64 host (Intel Xeon E5345). Nothing
> >>was running except the standard daemons and the CPU governor was set to
> >>"performance" on all CPUs. I then compared the network performance using
> >>netperf in default mode, through a tap interface and a virtio nic. I got
> >>the following results (quite reproducible, std below 0.5):
> >>- without IO thread: 107.36 MB/s
> >>- with IO thread:     89.93 MB/s
> >>
> >And the same test on the code from September 2009:
> >- without IO thread: 141.8 MB/s
> virtio-net is super finicky regarding mitigation strategies and
> their relationship to the I/O thread.  Different benchmarks will
> behave differently.  virtio-blk is probably a better device to test
> as you'll get much more consistent results across different types of
> I/O patterns.

netperf server on guest, RHEL5.4 guest (e1000), uq/master branch, TCG:

iothread: 236MB/s
no iothread: 215MB/s

Also noticed scp was slightly faster with iothread earlier this week,
don't remember numbers.
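
To put the figures in this thread side by side, the relative change can be
computed as below. percent_change is a hypothetical helper, and the two
setups (virtio over tap vs. e1000) are not directly comparable, so only the
with/without iothread difference inside each setup is meaningful.

```c
/*
 * Hypothetical helper: relative change of `measured` against `baseline`,
 * expressed in percent (negative means slower than the baseline).
 */
double percent_change(double baseline, double measured)
{
    return (measured - baseline) / baseline * 100.0;
}
```

With the quoted virtio numbers, percent_change(107.36, 89.93) comes out at
roughly -16.2, and with the e1000 numbers above, percent_change(215.0, 236.0)
is roughly +9.8.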

Patch

diff --git a/cpus.c b/cpus.c
index 9c50a34..2280db1 100644
--- a/cpus.c
+++ b/cpus.c
@@ -748,7 +748,7 @@  static void qemu_tcg_wait_io_event(void)
     CPUState *env;
 
     while (!any_cpu_has_work())
-        qemu_cond_timedwait(tcg_halt_cond, &qemu_global_mutex, 1000);
+        qemu_cond_timedwait(tcg_halt_cond, &qemu_global_mutex, qemu_calculate_timeout());
 
     qemu_mutex_unlock(&qemu_global_mutex);
 
diff --git a/vl.c b/vl.c
index 837be97..dbd81a1 100644
--- a/vl.c
+++ b/vl.c
@@ -1323,7 +1323,7 @@  void main_loop_wait(int nonblocking)
     if (nonblocking)
         timeout = 0;
     else {
-        timeout = qemu_calculate_timeout();
+        timeout = 1000;
         qemu_bh_update_timeout(&timeout);
     }