Patchwork [v3,6/6] qemu_calculate_timeout: increase minimum timeout to 1h

Submitter Stefano Stabellini
Date Jan. 27, 2012, 6:21 p.m.
Message ID <1327688498-12362-6-git-send-email-stefano.stabellini@eu.citrix.com>
Permalink /patch/138274/
State New

Comments

Stefano Stabellini - Jan. 27, 2012, 6:21 p.m.
There is no reason why the minimum timeout should be 1 second; it could
easily be 1 hour, and we would save lots of CPU cycles.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 qemu-timer.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
Paul Brook - Feb. 10, 2012, 12:26 a.m.
> There is no reason why the minimum timeout should be 1 second; it could
> easily be 1 hour, and we would save lots of CPU cycles.

No.

The reason we have this is because there are bits of code that rely on 
polling.  IIRC slirp and the floppy DMA engine were the main culprits. 
qemu_calculate_timeout is an ugly hack to poll at least once a second, 
allowing the guest to make forward progress when we miss an event.
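
(For reference, the main loop of that era consumes this value roughly like
this -- a simplified sketch, not verbatim code; the helpers are the ones in
main-loop.c and async.c:)

    void main_loop_wait(int nonblocking)
    {
        int timeout;

        if (nonblocking) {
            timeout = 0;
        } else {
            /* this is where the at-least-once-a-second poll comes from */
            timeout = qemu_calculate_timeout();
            /* idle bottom halves shrink it further, to 10 ms */
            qemu_bh_update_timeout(&timeout);
        }

        /* select()/poll() on all registered fds, returning after at most
         * 'timeout' ms even if nothing becomes ready */
        os_host_main_loop_wait(timeout);

        qemu_run_all_timers();
        qemu_bh_poll();
    }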

If you think you've fixed all those polling places then you should remove the 
timeout altogether and block indefinitely.  A 1h timeout is almost certainly 
not the right answer.

Paul
Paolo Bonzini - Feb. 10, 2012, 8:03 a.m.
On 02/10/2012 01:26 AM, Paul Brook wrote:
> The reason we have this is because there are bits of code that rely on
> polling.  IIRC slirp and the floppy DMA engine were the main culprits.
> qemu_calculate_timeout is an ugly hack to poll at least once a second,
> allowing the guest to make forward progress when we miss an event.

At least the floppy DMA engine is fine with it; it uses idle bottom 
halves (which are a hack and could be replaced by timers, but that's not 
relevant now).
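
(For context, a rough sketch of what the DMA engine does -- simplified from
hw/dma.c of that era:)

    static QEMUBH *dma_bh;

    static void DMA_run(void)
    {
        int rearm = 0;

        /* ... service pending transfers, set rearm if work remains ... */

        if (rearm) {
            /* an idle BH only needs to run within ~10 ms, so this is
             * really polling, not an event-driven wakeup */
            qemu_bh_schedule_idle(dma_bh);
        }
    }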

Slirp's timeouts indeed require polling (this is slirp_select_poll() in 
slirp/slirp.c):

    if (time_fasttimo && ((curtime - time_fasttimo) >= 2)) {
        tcp_fasttimo(slirp);
        time_fasttimo = 0;
    }
    if (do_slowtimo && ((curtime - last_slowtimo) >= 499)) {
        ip_slowtimo(slirp);
        tcp_slowtimo(slirp);
        last_slowtimo = curtime;
    }

Paolo
Paul Brook - Feb. 10, 2012, 9:52 a.m.
> On 02/10/2012 01:26 AM, Paul Brook wrote:
> > The reason we have this is because there are bits of code that rely on
> > polling.  IIRC slirp and the floppy DMA engine were the main culprits.
> > qemu_calculate_timeout is an ugly hack to poll at least once a second,
> > allowing the guest to make forward progress when we miss an event.
> 
> At least the floppy DMA engine is fine with it; it uses idle bottom
> halves (which are a hack and could be replaced by timers, but that's not
> relevant now).

I thought idle bottom halves were one of the things that made this timeout 
necessary.  How else are they going to get run?

Paul
Paolo Bonzini - Feb. 10, 2012, 10:45 a.m.
On 02/10/2012 10:52 AM, Paul Brook wrote:
> > At least the floppy DMA engine is fine with it; it uses idle bottom
> > halves (which are a hack and could be replaced by timers, but that's not
> > relevant now).
> I thought idle bottom halves were one of the things that made this timeout
> necessary.  How else are they going to get run?

The timeout is reduced to 10 ms when an idle bottom half is scheduled. 
See qemu_bh_update_timeout in async.c.
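
(Roughly, and paraphrased rather than verbatim:)

    void qemu_bh_update_timeout(int *timeout)
    {
        QEMUBH *bh;

        for (bh = first_bh; bh; bh = bh->next) {
            if (!bh->deleted && bh->scheduled) {
                if (bh->idle) {
                    /* idle BHs only need to be polled every ~10 ms */
                    *timeout = MIN(10, *timeout);
                } else {
                    /* non-idle BHs must run as soon as possible */
                    *timeout = 0;
                    break;
                }
            }
        }
    }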

Paolo
Paul Brook - Feb. 10, 2012, 11:09 a.m.
> > > At least the floppy DMA engine is fine with it; it uses idle bottom
> > > halves (which are a hack and could be replaced by timers, but that's
> > > not relevant now).
> > 
> > I thought idle bottom halves were one of the things that made this timeout
> > necessary.  How else are they going to get run?
> 
> The timeout is reduced to 10 ms when an idle bottom half is scheduled.
> See qemu_bh_update_timeout in async.c.

Ah, I see.  Idle BHs are indeed a nasty hack that should be removed, but not 
directly relevant to this 1s timeout.

I don't think this changes my overall conclusion: either we need this timeout 
to poll below the user-thinks-qemu-died threshold, or we should be blocking 
indefinitely.

Paul
Paolo Bonzini - Feb. 10, 2012, 11:18 a.m.
On 02/10/2012 12:19 PM, Stefano Stabellini wrote:
> I think you are right and the right thing to do would be blocking
> indefinitely.
> However, if slirp doesn't support it, we could have a timeout of 1000 ms
> if CONFIG_SLIRP is set, otherwise block indefinitely.

You could add a similar hack to qemu_bh_update_timeout for 
slirp_update_timeout.
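
(A hypothetical sketch of that hook -- slirp_update_timeout does not exist
yet, and the names below are illustrative only:)

    /* would be called from main_loop_wait() next to qemu_bh_update_timeout() */
    void slirp_update_timeout(int *timeout)
    {
        /* while any slirp instance is live, cap the wait so the fast/slow
         * TCP timers polled in slirp_select_poll() still fire even when
         * no fd becomes readable */
        if (!QTAILQ_EMPTY(&slirp_instances)) {
            *timeout = MIN(*timeout, 499);
        }
    }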

Paolo
Stefano Stabellini - Feb. 10, 2012, 11:19 a.m.
On Fri, 10 Feb 2012, Paul Brook wrote:
> > > > At least the floppy DMA engine is fine with it; it uses idle bottom
> > > > halves (which are a hack and could be replaced by timers, but that's
> > > > not relevant now).
> > > 
> > > I thought idle bottom halves were one of the things that made this timeout
> > > necessary.  How else are they going to get run?
> > 
> > The timeout is reduced to 10 ms when an idle bottom half is scheduled.
> > See qemu_bh_update_timeout in async.c.
> 
> Ah, I see.  Idle BHs are indeed a nasty hack that should be removed, but not 
> directly relevant to this 1s timeout.
> 
> I don't think this changes my overall conclusion: either we need this timeout 
> to poll below the user-thinks-qemu-died threshold, or we should be blocking 
> indefinitely.
 
I think you are right and the right thing to do would be blocking
indefinitely.
However, if slirp doesn't support it, we could have a timeout of 1000 ms
if CONFIG_SLIRP is set, otherwise block indefinitely.
Jan Kiszka - Feb. 10, 2012, 11:32 a.m.
On 2012-02-10 12:18, Paolo Bonzini wrote:
> On 02/10/2012 12:19 PM, Stefano Stabellini wrote:
>> I think you are right and the right thing to do would be blocking
>> indefinitely.
>> However, if slirp doesn't support it, we could have a timeout of 1000 ms
>> if CONFIG_SLIRP is set, otherwise block indefinitely.
> 
> You could add a similar hack to qemu_bh_update_timeout for
> slirp_update_timeout.

Real solutions would be preferred, but I know that the code is hairy. In
any case, please no CONFIG_SLIRP code forks.

Jan

Patch

diff --git a/qemu-timer.c b/qemu-timer.c
index 8c8bbc3..3207e40 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -852,6 +852,6 @@  fail:
 
 int qemu_calculate_timeout(void)
 {
-    return 1000;
+    return 1000*60*60;
 }