Message ID | 8994198D-0AA5-45C0-8A46-375BCA34E201@hq.newdream.net |
---|---|
State | New |
On Wed, Dec 09, 2009 at 01:10:18PM -0800, Andrew Farmer wrote:
> Right now, if an incoming migrate through exec fails, the qemu process
> will end up chewing CPU indefinitely - it looks like it closes the
> migration FD but doesn't remove its IO handler properly. An easy way
> to reproduce this is to try launching with -incoming exec:/bin/false.
> This is obviously useless, but illustrates the issue handily.

I've hit this in real life too, with restore from a file containing
the saved state which had got corrupted/truncated. I only discovered
the failure when I wondered why QEMU was chewing 100% cpu.

> One solution might be to retry the command on migrate failure, but that
> won't really help in all circumstances (for instance, if the migration
> command is broken!), so it seems equally appropriate to just die if an
> incoming exec migration fails. The patch is trivial, and follows - does
> this look sensible? (I'm new to qemu development, but trying to pick it up.)

It looks like a reasonable approach to me. If we carried on running, it
would be hard for apps to determine whether migration succeeded & thus
QEMU is running, or whether it failed and is just idling. By exiting we
give the management app/user the option to retry simply by relaunching.

> diff --git a/migration-exec.c b/migration-exec.c
> index c830669..0292c19 100644
> --- a/migration-exec.c
> +++ b/migration-exec.c
> @@ -114,7 +114,7 @@ static void exec_accept_incoming_migration(void *opaque)
>      ret = qemu_loadvm_state(f);
>      if (ret < 0) {
>          fprintf(stderr, "load of migration failed\n");
> -        goto err;
> +        exit(0);
>      }
>      qemu_announce_self();
>      dprintf("successfully loaded vm state\n");
> @@ -123,7 +123,6 @@ static void exec_accept_incoming_migration(void *opaque)
>      if (autostart)
>          vm_start();
>
> -err:
>      qemu_fclose(f);
>  }

Daniel
On 11 Dec 2009, at 13:19, Daniel P. Berrange wrote:
> On Wed, Dec 09, 2009 at 01:10:18PM -0800, Andrew Farmer wrote:
>> Right now, if an incoming migrate through exec fails, the qemu process
>> will end up chewing CPU indefinitely - it looks like it closes the
>> migration FD but doesn't remove its IO handler properly. An easy way
>> to reproduce this is to try launching with -incoming exec:/bin/false.
>> This is obviously useless, but illustrates the issue handily.
>
> I've hit this in real life too, with restore from a file containing
> the saved state which had got corrupted/truncated. I only discovered
> the failure when I wondered why QEMU was chewing 100% cpu.

Hrm... actually, if this also happens on state restore, the problem
might not be in migration-exec at all (or there might be multiple bugs
with similar symptoms), as the fix I identified was specific to exec
failures.
diff --git a/migration-exec.c b/migration-exec.c
index c830669..0292c19 100644
--- a/migration-exec.c
+++ b/migration-exec.c
@@ -114,7 +114,7 @@ static void exec_accept_incoming_migration(void *opaque)
     ret = qemu_loadvm_state(f);
     if (ret < 0) {
         fprintf(stderr, "load of migration failed\n");
-        goto err;
+        exit(0);
     }
     qemu_announce_self();
     dprintf("successfully loaded vm state\n");
@@ -123,7 +123,6 @@ static void exec_accept_incoming_migration(void *opaque)
     if (autostart)
         vm_start();
 
-err:
     qemu_fclose(f);
 }
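
As a standalone illustration of the symptom described in this thread (this is not QEMU code), the sketch below shows one way a select()-based loop can spin: if a descriptor is closed but left in the watched set, select() fails at once with EBADF instead of sleeping until the timeout. Whether this is the exact mechanism inside QEMU is an assumption based on the description above ("closes the migration FD but doesn't remove its IO handler"). The helper name count_ebadf_polls is hypothetical.

```c
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of QEMU.
 *
 * Creates a pipe, closes both ends, then keeps polling the stale read
 * descriptor the way a main loop would if the handler were never
 * unregistered. Returns how many select() calls failed with EBADF,
 * or -1 if the pipe could not be created.
 */
int count_ebadf_polls(int iterations)
{
    int fds[2];
    int failures = 0;
    int i;

    if (pipe(fds) < 0)
        return -1;

    close(fds[1]);
    close(fds[0]);      /* descriptor is gone, but we still "watch" it */

    for (i = 0; i < iterations; i++) {
        fd_set rfds;
        struct timeval tv = { 1, 0 };   /* an idle loop would sleep up to 1s */

        FD_ZERO(&rfds);
        FD_SET(fds[0], &rfds);

        /* With a stale descriptor in the set, select() does not wait. */
        if (select(fds[0] + 1, &rfds, NULL, NULL, &tv) < 0 && errno == EBADF)
            failures++;
    }
    return failures;
}
```

Called from a small main(), count_ebadf_polls(5) should return 5 on a POSIX system, with none of the calls waiting for the one-second timeout. A loop that ignores this error and retries would consume CPU continuously.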