Message ID: 1478265017-5700-1-git-send-email-thuth@redhat.com
State: New
On Fri, Nov 04, 2016 at 02:10:17PM +0100, Thomas Huth wrote:
> qemu_savevm_state_iterate() expects the iterators to return 1
> when they are done, and 0 if there is still something left to do.
> However, ram_save_iterate() does not obey this rule and returns
> the number of saved pages instead. This causes a fatal hang with
> ppc64 guests when you run QEMU like this (also works with TCG):
>
> qemu-img create -f qcow2 /tmp/test.qcow2 1M
> qemu-system-ppc64 -nographic -nodefaults -m 256 \
>    -hda /tmp/test.qcow2 -serial mon:stdio
>
> ... then switch to the monitor by pressing CTRL-a c and try to
> save a snapshot with "savevm test1" for example.
>
> After the first iteration, ram_save_iterate() always returns 0 here,
> so that qemu_savevm_state_iterate() hangs in an endless loop and you
> can only "kill -9" the QEMU process.
> Fix it by using proper return values in ram_save_iterate().
>
> Signed-off-by: Thomas Huth <thuth@redhat.com>

Hmm. I think the change is technically correct, but I'm uneasy with
this approach to the solution. The whole reason this wasn't caught
earlier is that almost nothing looks at the return value. Without
changing that I think it's very likely someone will mess this up
again.

I think it would be preferable to change the return type to void to
make it explicit that this function is not directly returning the
"completion" status, but instead that's calculated from the other
progress variables it updates.
> ---
>  migration/ram.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index fb9252d..a1c8089 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1987,7 +1987,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>      int ret;
>      int i;
>      int64_t t0;
> -    int pages_sent = 0;
> +    int done = 0;
>
>      rcu_read_lock();
>      if (ram_list.version != last_version) {
> @@ -2007,9 +2007,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>          pages = ram_find_and_save_block(f, false, &bytes_transferred);
>          /* no more pages to sent */
>          if (pages == 0) {
> +            done = 1;
>              break;
>          }
> -        pages_sent += pages;
>          acct_info.iterations++;
>
>          /* we want to check in the 1st loop, just in case it was the 1st time
> @@ -2044,7 +2044,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>          return ret;
>      }
>
> -    return pages_sent;
> +    return done;
>  }
>
>  /* Called with iothread lock */
On 08.11.2016 02:14, David Gibson wrote:
> On Fri, Nov 04, 2016 at 02:10:17PM +0100, Thomas Huth wrote:
>> qemu_savevm_state_iterate() expects the iterators to return 1
>> when they are done, and 0 if there is still something left to do.
>> However, ram_save_iterate() does not obey this rule and returns
>> the number of saved pages instead. This causes a fatal hang with
>> ppc64 guests when you run QEMU like this (also works with TCG):
>>
>> qemu-img create -f qcow2 /tmp/test.qcow2 1M
>> qemu-system-ppc64 -nographic -nodefaults -m 256 \
>>    -hda /tmp/test.qcow2 -serial mon:stdio
>>
>> ... then switch to the monitor by pressing CTRL-a c and try to
>> save a snapshot with "savevm test1" for example.
>>
>> After the first iteration, ram_save_iterate() always returns 0 here,
>> so that qemu_savevm_state_iterate() hangs in an endless loop and you
>> can only "kill -9" the QEMU process.
>> Fix it by using proper return values in ram_save_iterate().
>>
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>
> Hmm. I think the change is technically correct, but I'm uneasy with
> this approach to the solution. The whole reason this wasn't caught
> earlier is that almost nothing looks at the return value. Without
> changing that I think it's very likely someone will mess this up
> again.
>
> I think it would be preferable to change the return type to void to
> make it explicit that this function is not directly returning the
> "completion" status, but instead that's calculated from the other
> progress variables it updates.

Not sure how such a patch should finally look. Could you propose a
patch? Anyway, we're in soft freeze already ... do we still want such a
major change of the logic at this point in time? If not, we should
maybe go with fixing the return type only for 2.8, and do the major
change for 2.9 instead?

 Thomas
On (Fri) 04 Nov 2016 [14:10:17], Thomas Huth wrote: > qemu_savevm_state_iterate() expects the iterators to return 1 > when they are done, and 0 if there is still something left to do. > However, ram_save_iterate() does not obey this rule and returns > the number of saved pages instead. This causes a fatal hang with > ppc64 guests when you run QEMU like this (also works with TCG): "works with" -- does that mean reproduces with? > qemu-img create -f qcow2 /tmp/test.qcow2 1M > qemu-system-ppc64 -nographic -nodefaults -m 256 \ > -hda /tmp/test.qcow2 -serial mon:stdio > > ... then switch to the monitor by pressing CTRL-a c and try to > save a snapshot with "savevm test1" for example. > > After the first iteration, ram_save_iterate() always returns 0 here, > so that qemu_savevm_state_iterate() hangs in an endless loop and you > can only "kill -9" the QEMU process. > Fix it by using proper return values in ram_save_iterate(). > > Signed-off-by: Thomas Huth <thuth@redhat.com> > --- > migration/ram.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/migration/ram.c b/migration/ram.c > index fb9252d..a1c8089 100644 > --- a/migration/ram.c > +++ b/migration/ram.c > @@ -1987,7 +1987,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) > int ret; > int i; > int64_t t0; > - int pages_sent = 0; > + int done = 0; > > rcu_read_lock(); > if (ram_list.version != last_version) { > @@ -2007,9 +2007,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) > pages = ram_find_and_save_block(f, false, &bytes_transferred); > /* no more pages to sent */ > if (pages == 0) { > + done = 1; > break; > } > - pages_sent += pages; > acct_info.iterations++; > > /* we want to check in the 1st loop, just in case it was the 1st time > @@ -2044,7 +2044,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) > return ret; > } > > - return pages_sent; > + return done; > } I agree with David, we can just remove the return value. 
The first patch of the series can do that; and this one could become the 2nd patch. Should be OK for the soft freeze. Amit
On 09.11.2016 08:18, Amit Shah wrote: > On (Fri) 04 Nov 2016 [14:10:17], Thomas Huth wrote: >> qemu_savevm_state_iterate() expects the iterators to return 1 >> when they are done, and 0 if there is still something left to do. >> However, ram_save_iterate() does not obey this rule and returns >> the number of saved pages instead. This causes a fatal hang with >> ppc64 guests when you run QEMU like this (also works with TCG): > > "works with" -- does that mean reproduces with? Yes, that's what I've meant: You can reproduce it with TCG (e.g. running on a x86 system), too, there's no need for a real POWER machine with KVM here. >> qemu-img create -f qcow2 /tmp/test.qcow2 1M >> qemu-system-ppc64 -nographic -nodefaults -m 256 \ >> -hda /tmp/test.qcow2 -serial mon:stdio >> >> ... then switch to the monitor by pressing CTRL-a c and try to >> save a snapshot with "savevm test1" for example. >> >> After the first iteration, ram_save_iterate() always returns 0 here, >> so that qemu_savevm_state_iterate() hangs in an endless loop and you >> can only "kill -9" the QEMU process. >> Fix it by using proper return values in ram_save_iterate(). 
>> >> Signed-off-by: Thomas Huth <thuth@redhat.com> >> --- >> migration/ram.c | 6 +++--- >> 1 file changed, 3 insertions(+), 3 deletions(-) >> >> diff --git a/migration/ram.c b/migration/ram.c >> index fb9252d..a1c8089 100644 >> --- a/migration/ram.c >> +++ b/migration/ram.c >> @@ -1987,7 +1987,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) >> int ret; >> int i; >> int64_t t0; >> - int pages_sent = 0; >> + int done = 0; >> >> rcu_read_lock(); >> if (ram_list.version != last_version) { >> @@ -2007,9 +2007,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) >> pages = ram_find_and_save_block(f, false, &bytes_transferred); >> /* no more pages to sent */ >> if (pages == 0) { >> + done = 1; >> break; >> } >> - pages_sent += pages; >> acct_info.iterations++; >> >> /* we want to check in the 1st loop, just in case it was the 1st time >> @@ -2044,7 +2044,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) >> return ret; >> } >> >> - return pages_sent; >> + return done; >> } > > I agree with David, we can just remove the return value. The first > patch of the series can do that; and this one could become the 2nd > patch. Should be OK for the soft freeze. Sorry, I still did not quite get it - if I'd change the return type of ram_save_iterate() and the other iterate functions to "void", how is qemu_savevm_state_iterate() supposed to know whether all iterators are done or not? And other iterators also use negative return values to signal errors - should that then be handled via an "Error **" parameter instead? ... my gut feeling still says that such a bigger rework (we've got to touch all iterators for this!) should rather not be done right in the middle of the freeze period... Thomas
On Wed, Nov 09, 2016 at 08:46:34AM +0100, Thomas Huth wrote: > On 09.11.2016 08:18, Amit Shah wrote: > > On (Fri) 04 Nov 2016 [14:10:17], Thomas Huth wrote: > >> qemu_savevm_state_iterate() expects the iterators to return 1 > >> when they are done, and 0 if there is still something left to do. > >> However, ram_save_iterate() does not obey this rule and returns > >> the number of saved pages instead. This causes a fatal hang with > >> ppc64 guests when you run QEMU like this (also works with TCG): > > > > "works with" -- does that mean reproduces with? > > Yes, that's what I've meant: You can reproduce it with TCG (e.g. running > on a x86 system), too, there's no need for a real POWER machine with KVM > here. > > >> qemu-img create -f qcow2 /tmp/test.qcow2 1M > >> qemu-system-ppc64 -nographic -nodefaults -m 256 \ > >> -hda /tmp/test.qcow2 -serial mon:stdio > >> > >> ... then switch to the monitor by pressing CTRL-a c and try to > >> save a snapshot with "savevm test1" for example. > >> > >> After the first iteration, ram_save_iterate() always returns 0 here, > >> so that qemu_savevm_state_iterate() hangs in an endless loop and you > >> can only "kill -9" the QEMU process. > >> Fix it by using proper return values in ram_save_iterate(). 
> >> > >> Signed-off-by: Thomas Huth <thuth@redhat.com> > >> --- > >> migration/ram.c | 6 +++--- > >> 1 file changed, 3 insertions(+), 3 deletions(-) > >> > >> diff --git a/migration/ram.c b/migration/ram.c > >> index fb9252d..a1c8089 100644 > >> --- a/migration/ram.c > >> +++ b/migration/ram.c > >> @@ -1987,7 +1987,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) > >> int ret; > >> int i; > >> int64_t t0; > >> - int pages_sent = 0; > >> + int done = 0; > >> > >> rcu_read_lock(); > >> if (ram_list.version != last_version) { > >> @@ -2007,9 +2007,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) > >> pages = ram_find_and_save_block(f, false, &bytes_transferred); > >> /* no more pages to sent */ > >> if (pages == 0) { > >> + done = 1; > >> break; > >> } > >> - pages_sent += pages; > >> acct_info.iterations++; > >> > >> /* we want to check in the 1st loop, just in case it was the 1st time > >> @@ -2044,7 +2044,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) > >> return ret; > >> } > >> > >> - return pages_sent; > >> + return done; > >> } > > > > I agree with David, we can just remove the return value. The first > > patch of the series can do that; and this one could become the 2nd > > patch. Should be OK for the soft freeze. > > Sorry, I still did not quite get it - if I'd change the return type of > ram_save_iterate() and the other iterate functions to "void", how is > qemu_savevm_state_iterate() supposed to know whether all iterators are > done or not? It doesn't - it's return value is, in turn, mostly ignored by the caller. On the migration path we already determine whether to proceed or not based purely on the separate state_pending callbacks. For the savevm path, we don't really need the iteration phase at all - we can jump straight to the completion phase, since downtime is not an issue. > And other iterators also use negative return values to > signal errors Ah.. that's a good point. 
Possibly we should leave in the negative codes for errors and just
remove all positive return values.

> - should that then be handled via an "Error **" parameter
> instead? ... my gut feeling still says that such a bigger rework (we've
> got to touch all iterators for this!) should rather not be done right in
> the middle of the freeze period...

Yeah the errors could - and probably should - be handled with Error **
instead of return codes, but I also wonder if that's too much for soft
freeze. I guess that's the call of the migration guys.
* Thomas Huth (thuth@redhat.com) wrote: > On 09.11.2016 08:18, Amit Shah wrote: > > On (Fri) 04 Nov 2016 [14:10:17], Thomas Huth wrote: > >> qemu_savevm_state_iterate() expects the iterators to return 1 > >> when they are done, and 0 if there is still something left to do. > >> However, ram_save_iterate() does not obey this rule and returns > >> the number of saved pages instead. This causes a fatal hang with > >> ppc64 guests when you run QEMU like this (also works with TCG): > > > > "works with" -- does that mean reproduces with? > > Yes, that's what I've meant: You can reproduce it with TCG (e.g. running > on a x86 system), too, there's no need for a real POWER machine with KVM > here. How did you trigger it on x86? Dave > >> qemu-img create -f qcow2 /tmp/test.qcow2 1M > >> qemu-system-ppc64 -nographic -nodefaults -m 256 \ > >> -hda /tmp/test.qcow2 -serial mon:stdio > >> > >> ... then switch to the monitor by pressing CTRL-a c and try to > >> save a snapshot with "savevm test1" for example. > >> > >> After the first iteration, ram_save_iterate() always returns 0 here, > >> so that qemu_savevm_state_iterate() hangs in an endless loop and you > >> can only "kill -9" the QEMU process. > >> Fix it by using proper return values in ram_save_iterate(). 
> >> > >> Signed-off-by: Thomas Huth <thuth@redhat.com> > >> --- > >> migration/ram.c | 6 +++--- > >> 1 file changed, 3 insertions(+), 3 deletions(-) > >> > >> diff --git a/migration/ram.c b/migration/ram.c > >> index fb9252d..a1c8089 100644 > >> --- a/migration/ram.c > >> +++ b/migration/ram.c > >> @@ -1987,7 +1987,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) > >> int ret; > >> int i; > >> int64_t t0; > >> - int pages_sent = 0; > >> + int done = 0; > >> > >> rcu_read_lock(); > >> if (ram_list.version != last_version) { > >> @@ -2007,9 +2007,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) > >> pages = ram_find_and_save_block(f, false, &bytes_transferred); > >> /* no more pages to sent */ > >> if (pages == 0) { > >> + done = 1; > >> break; > >> } > >> - pages_sent += pages; > >> acct_info.iterations++; > >> > >> /* we want to check in the 1st loop, just in case it was the 1st time > >> @@ -2044,7 +2044,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) > >> return ret; > >> } > >> > >> - return pages_sent; > >> + return done; > >> } > > > > I agree with David, we can just remove the return value. The first > > patch of the series can do that; and this one could become the 2nd > > patch. Should be OK for the soft freeze. > > Sorry, I still did not quite get it - if I'd change the return type of > ram_save_iterate() and the other iterate functions to "void", how is > qemu_savevm_state_iterate() supposed to know whether all iterators are > done or not? And other iterators also use negative return values to > signal errors - should that then be handled via an "Error **" parameter > instead? ... my gut feeling still says that such a bigger rework (we've > got to touch all iterators for this!) should rather not be done right in > the middle of the freeze period... > > Thomas > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 09.11.2016 16:13, Dr. David Alan Gilbert wrote: > * Thomas Huth (thuth@redhat.com) wrote: >> On 09.11.2016 08:18, Amit Shah wrote: >>> On (Fri) 04 Nov 2016 [14:10:17], Thomas Huth wrote: >>>> qemu_savevm_state_iterate() expects the iterators to return 1 >>>> when they are done, and 0 if there is still something left to do. >>>> However, ram_save_iterate() does not obey this rule and returns >>>> the number of saved pages instead. This causes a fatal hang with >>>> ppc64 guests when you run QEMU like this (also works with TCG): >>> >>> "works with" -- does that mean reproduces with? >> >> Yes, that's what I've meant: You can reproduce it with TCG (e.g. running >> on a x86 system), too, there's no need for a real POWER machine with KVM >> here. > > How did you trigger it on x86? As described below - qemu-img + qemu-system-ppc64 + savevm is enough to trigger it on a x86 host. > >>>> qemu-img create -f qcow2 /tmp/test.qcow2 1M >>>> qemu-system-ppc64 -nographic -nodefaults -m 256 \ >>>> -hda /tmp/test.qcow2 -serial mon:stdio >>>> >>>> ... then switch to the monitor by pressing CTRL-a c and try to >>>> save a snapshot with "savevm test1" for example. Thomas
* Thomas Huth (thuth@redhat.com) wrote: > On 09.11.2016 16:13, Dr. David Alan Gilbert wrote: > > * Thomas Huth (thuth@redhat.com) wrote: > >> On 09.11.2016 08:18, Amit Shah wrote: > >>> On (Fri) 04 Nov 2016 [14:10:17], Thomas Huth wrote: > >>>> qemu_savevm_state_iterate() expects the iterators to return 1 > >>>> when they are done, and 0 if there is still something left to do. > >>>> However, ram_save_iterate() does not obey this rule and returns > >>>> the number of saved pages instead. This causes a fatal hang with > >>>> ppc64 guests when you run QEMU like this (also works with TCG): > >>> > >>> "works with" -- does that mean reproduces with? > >> > >> Yes, that's what I've meant: You can reproduce it with TCG (e.g. running > >> on a x86 system), too, there's no need for a real POWER machine with KVM > >> here. > > > > How did you trigger it on x86? > > As described below - qemu-img + qemu-system-ppc64 + savevm is enough to > trigger it on a x86 host. Oh OK; so yes still ppc64 target. Dave > > > >>>> qemu-img create -f qcow2 /tmp/test.qcow2 1M > >>>> qemu-system-ppc64 -nographic -nodefaults -m 256 \ > >>>> -hda /tmp/test.qcow2 -serial mon:stdio > >>>> > >>>> ... then switch to the monitor by pressing CTRL-a c and try to > >>>> save a snapshot with "savevm test1" for example. > > Thomas > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Thomas Huth <thuth@redhat.com> wrote:
> qemu_savevm_state_iterate() expects the iterators to return 1
> when they are done, and 0 if there is still something left to do.
> However, ram_save_iterate() does not obey this rule and returns
> the number of saved pages instead. This causes a fatal hang with
> ppc64 guests when you run QEMU like this (also works with TCG):
>
> qemu-img create -f qcow2 /tmp/test.qcow2 1M
> qemu-system-ppc64 -nographic -nodefaults -m 256 \
>    -hda /tmp/test.qcow2 -serial mon:stdio
>
> ... then switch to the monitor by pressing CTRL-a c and try to
> save a snapshot with "savevm test1" for example.
>
> After the first iteration, ram_save_iterate() always returns 0 here,
> so that qemu_savevm_state_iterate() hangs in an endless loop and you
> can only "kill -9" the QEMU process.
> Fix it by using proper return values in ram_save_iterate().
>
> Signed-off-by: Thomas Huth <thuth@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

Applied.

I don't know how we broke this so badly.

Thanks, Juan.
On Mon, Nov 14, 2016 at 07:34:59PM +0100, Juan Quintela wrote: > Thomas Huth <thuth@redhat.com> wrote: > > qemu_savevm_state_iterate() expects the iterators to return 1 > > when they are done, and 0 if there is still something left to do. > > However, ram_save_iterate() does not obey this rule and returns > > the number of saved pages instead. This causes a fatal hang with > > ppc64 guests when you run QEMU like this (also works with TCG): > > > > qemu-img create -f qcow2 /tmp/test.qcow2 1M > > qemu-system-ppc64 -nographic -nodefaults -m 256 \ > > -hda /tmp/test.qcow2 -serial mon:stdio > > > > ... then switch to the monitor by pressing CTRL-a c and try to > > save a snapshot with "savevm test1" for example. > > > > After the first iteration, ram_save_iterate() always returns 0 here, > > so that qemu_savevm_state_iterate() hangs in an endless loop and you > > can only "kill -9" the QEMU process. > > Fix it by using proper return values in ram_save_iterate(). > > > > Signed-off-by: Thomas Huth <thuth@redhat.com> > > Reviewed-by: Juan Quintela <quintela@redhat.com> > > Applied. > > I don't know how we broked this so much. Note that block save iterate has the same bug...
On 17.11.2016 04:45, David Gibson wrote: > On Mon, Nov 14, 2016 at 07:34:59PM +0100, Juan Quintela wrote: >> Thomas Huth <thuth@redhat.com> wrote: >>> qemu_savevm_state_iterate() expects the iterators to return 1 >>> when they are done, and 0 if there is still something left to do. >>> However, ram_save_iterate() does not obey this rule and returns >>> the number of saved pages instead. This causes a fatal hang with >>> ppc64 guests when you run QEMU like this (also works with TCG): >>> >>> qemu-img create -f qcow2 /tmp/test.qcow2 1M >>> qemu-system-ppc64 -nographic -nodefaults -m 256 \ >>> -hda /tmp/test.qcow2 -serial mon:stdio >>> >>> ... then switch to the monitor by pressing CTRL-a c and try to >>> save a snapshot with "savevm test1" for example. >>> >>> After the first iteration, ram_save_iterate() always returns 0 here, >>> so that qemu_savevm_state_iterate() hangs in an endless loop and you >>> can only "kill -9" the QEMU process. >>> Fix it by using proper return values in ram_save_iterate(). >>> >>> Signed-off-by: Thomas Huth <thuth@redhat.com> >> >> Reviewed-by: Juan Quintela <quintela@redhat.com> >> >> Applied. >> >> I don't know how we broked this so much. > > Note that block save iterate has the same bug... I think you're right. Care to send a patch? Thomas
On 18.11.2016 09:13, Thomas Huth wrote: > On 17.11.2016 04:45, David Gibson wrote: >> On Mon, Nov 14, 2016 at 07:34:59PM +0100, Juan Quintela wrote: >>> Thomas Huth <thuth@redhat.com> wrote: >>>> qemu_savevm_state_iterate() expects the iterators to return 1 >>>> when they are done, and 0 if there is still something left to do. >>>> However, ram_save_iterate() does not obey this rule and returns >>>> the number of saved pages instead. This causes a fatal hang with >>>> ppc64 guests when you run QEMU like this (also works with TCG): >>>> >>>> qemu-img create -f qcow2 /tmp/test.qcow2 1M >>>> qemu-system-ppc64 -nographic -nodefaults -m 256 \ >>>> -hda /tmp/test.qcow2 -serial mon:stdio >>>> >>>> ... then switch to the monitor by pressing CTRL-a c and try to >>>> save a snapshot with "savevm test1" for example. >>>> >>>> After the first iteration, ram_save_iterate() always returns 0 here, >>>> so that qemu_savevm_state_iterate() hangs in an endless loop and you >>>> can only "kill -9" the QEMU process. >>>> Fix it by using proper return values in ram_save_iterate(). >>>> >>>> Signed-off-by: Thomas Huth <thuth@redhat.com> >>> >>> Reviewed-by: Juan Quintela <quintela@redhat.com> >>> >>> Applied. >>> >>> I don't know how we broked this so much. >> >> Note that block save iterate has the same bug... > > I think you're right. Care to send a patch? Looking at this issue again ... could it be that block_save_iterate() is currently just dead code? As far as I can see, the ->save_live_iterate() handlers are only called from qemu_savevm_state_iterate(), right? And qemu_savevm_state_iterate() only calls the handlers if se->ops->is_active(se->opaque) returns true. But block_is_active() seems to only return 0 during savevm, most likely because qemu_savevm_state() explicitly sets the "blk" and "shared" MigrationParams to zero. So to me, it looks like we could also just remove block_save_iterate() completely ... or did I miss something here? Thomas
* Thomas Huth (thuth@redhat.com) wrote: > On 18.11.2016 09:13, Thomas Huth wrote: > > On 17.11.2016 04:45, David Gibson wrote: > >> On Mon, Nov 14, 2016 at 07:34:59PM +0100, Juan Quintela wrote: > >>> Thomas Huth <thuth@redhat.com> wrote: > >>>> qemu_savevm_state_iterate() expects the iterators to return 1 > >>>> when they are done, and 0 if there is still something left to do. > >>>> However, ram_save_iterate() does not obey this rule and returns > >>>> the number of saved pages instead. This causes a fatal hang with > >>>> ppc64 guests when you run QEMU like this (also works with TCG): > >>>> > >>>> qemu-img create -f qcow2 /tmp/test.qcow2 1M > >>>> qemu-system-ppc64 -nographic -nodefaults -m 256 \ > >>>> -hda /tmp/test.qcow2 -serial mon:stdio > >>>> > >>>> ... then switch to the monitor by pressing CTRL-a c and try to > >>>> save a snapshot with "savevm test1" for example. > >>>> > >>>> After the first iteration, ram_save_iterate() always returns 0 here, > >>>> so that qemu_savevm_state_iterate() hangs in an endless loop and you > >>>> can only "kill -9" the QEMU process. > >>>> Fix it by using proper return values in ram_save_iterate(). > >>>> > >>>> Signed-off-by: Thomas Huth <thuth@redhat.com> > >>> > >>> Reviewed-by: Juan Quintela <quintela@redhat.com> > >>> > >>> Applied. > >>> > >>> I don't know how we broked this so much. > >> > >> Note that block save iterate has the same bug... > > > > I think you're right. Care to send a patch? > > Looking at this issue again ... could it be that block_save_iterate() is > currently just dead code? > As far as I can see, the ->save_live_iterate() handlers are only called > from qemu_savevm_state_iterate(), right? And qemu_savevm_state_iterate() > only calls the handlers if se->ops->is_active(se->opaque) returns true. > But block_is_active() seems to only return 0 during savevm, most likely > because qemu_savevm_state() explicitly sets the "blk" and "shared" > MigrationParams to zero. 
> So to me, it looks like we could also just remove block_save_iterate()
> completely ... or did I miss something here?

Doesn't it get called by migrate -b ?

Dave

> Thomas
>
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 16.12.2016 18:03, Dr. David Alan Gilbert wrote: > * Thomas Huth (thuth@redhat.com) wrote: >> On 18.11.2016 09:13, Thomas Huth wrote: >>> On 17.11.2016 04:45, David Gibson wrote: >>>> On Mon, Nov 14, 2016 at 07:34:59PM +0100, Juan Quintela wrote: >>>>> Thomas Huth <thuth@redhat.com> wrote: >>>>>> qemu_savevm_state_iterate() expects the iterators to return 1 >>>>>> when they are done, and 0 if there is still something left to do. >>>>>> However, ram_save_iterate() does not obey this rule and returns >>>>>> the number of saved pages instead. This causes a fatal hang with >>>>>> ppc64 guests when you run QEMU like this (also works with TCG): >>>>>> >>>>>> qemu-img create -f qcow2 /tmp/test.qcow2 1M >>>>>> qemu-system-ppc64 -nographic -nodefaults -m 256 \ >>>>>> -hda /tmp/test.qcow2 -serial mon:stdio >>>>>> >>>>>> ... then switch to the monitor by pressing CTRL-a c and try to >>>>>> save a snapshot with "savevm test1" for example. >>>>>> >>>>>> After the first iteration, ram_save_iterate() always returns 0 here, >>>>>> so that qemu_savevm_state_iterate() hangs in an endless loop and you >>>>>> can only "kill -9" the QEMU process. >>>>>> Fix it by using proper return values in ram_save_iterate(). >>>>>> >>>>>> Signed-off-by: Thomas Huth <thuth@redhat.com> >>>>> >>>>> Reviewed-by: Juan Quintela <quintela@redhat.com> >>>>> >>>>> Applied. >>>>> >>>>> I don't know how we broked this so much. >>>> >>>> Note that block save iterate has the same bug... >>> >>> I think you're right. Care to send a patch? >> >> Looking at this issue again ... could it be that block_save_iterate() is >> currently just dead code? >> As far as I can see, the ->save_live_iterate() handlers are only called >> from qemu_savevm_state_iterate(), right? And qemu_savevm_state_iterate() >> only calls the handlers if se->ops->is_active(se->opaque) returns true. 
>> But block_is_active() seems to only return 0 during savevm, most likely
>> because qemu_savevm_state() explicitly sets the "blk" and "shared"
>> MigrationParams to zero.
>> So to me, it looks like we could also just remove block_save_iterate()
>> completely ... or did I miss something here?
>
> Doesn't it get called by migrate -b ?

Ah, right, yes, I somehow missed that ... I probably shouldn't do such
experiments at the end of Friday afternoon ;-)

OK, so it seems that
- block_save_iterate() is not called during savevm at all
  (and thus the bad return code does not matter here)
- migrate -b runs block_save_iterate(), but the return code is ignored
  in migration_thread()

So we do not have a real problem here, but I think we should still clean
up the return code of block_save_iterate() to be on the safe side for
the future...

 Thomas
On 12/19/2016 11:30 AM, Thomas Huth wrote: > On 16.12.2016 18:03, Dr. David Alan Gilbert wrote: >> * Thomas Huth (thuth@redhat.com) wrote: >>> On 18.11.2016 09:13, Thomas Huth wrote: >>>> On 17.11.2016 04:45, David Gibson wrote: >>>>> On Mon, Nov 14, 2016 at 07:34:59PM +0100, Juan Quintela wrote: >>>>>> Thomas Huth <thuth@redhat.com> wrote: >>>>>>> qemu_savevm_state_iterate() expects the iterators to return 1 >>>>>>> when they are done, and 0 if there is still something left to do. >>>>>>> However, ram_save_iterate() does not obey this rule and returns >>>>>>> the number of saved pages instead. This causes a fatal hang with >>>>>>> ppc64 guests when you run QEMU like this (also works with TCG): >>>>>>> >>>>>>> qemu-img create -f qcow2 /tmp/test.qcow2 1M >>>>>>> qemu-system-ppc64 -nographic -nodefaults -m 256 \ >>>>>>> -hda /tmp/test.qcow2 -serial mon:stdio >>>>>>> >>>>>>> ... then switch to the monitor by pressing CTRL-a c and try to >>>>>>> save a snapshot with "savevm test1" for example. >>>>>>> >>>>>>> After the first iteration, ram_save_iterate() always returns 0 here, >>>>>>> so that qemu_savevm_state_iterate() hangs in an endless loop and you >>>>>>> can only "kill -9" the QEMU process. >>>>>>> Fix it by using proper return values in ram_save_iterate(). >>>>>>> >>>>>>> Signed-off-by: Thomas Huth <thuth@redhat.com> >>>>>> >>>>>> Reviewed-by: Juan Quintela <quintela@redhat.com> >>>>>> >>>>>> Applied. >>>>>> >>>>>> I don't know how we broked this so much. >>>>> >>>>> Note that block save iterate has the same bug... >>>> >>>> I think you're right. Care to send a patch? >>> >>> Looking at this issue again ... could it be that block_save_iterate() is >>> currently just dead code? >>> As far as I can see, the ->save_live_iterate() handlers are only called >>> from qemu_savevm_state_iterate(), right? And qemu_savevm_state_iterate() >>> only calls the handlers if se->ops->is_active(se->opaque) returns true. 
>>> But block_is_active() seems to only return 0 during savevm, most likely >>> because qemu_savevm_state() explicitly sets the "blk" and "shared" >>> MigrationParams to zero. >>> So to me, it looks like we could also just remove block_save_iterate() >>> completely ... or did I miss something here? >> >> Doesn't it get called by migrate -b ? > > Ah, right, yes, I somehow missed that ... I probably shouldn't do such > experiments at the end of Friday afternoon ;-) > > OK, so it seems that > - block_save_iterate() is not called during savevm at all > (and thus the bad return code does not matter here) > - migrate -b runs block_save_iterate() but the return code is ignored in > migration_thread() > > So we do not have a real problem here, but I think we should still clean > up the return code of block_save_iterate() to be on the safe side for > the future... > > Thomas > > If it confused you, it'll confuse someone else. Worth fixing for consistency's sake alone. --js
diff --git a/migration/ram.c b/migration/ram.c
index fb9252d..a1c8089 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1987,7 +1987,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
     int ret;
     int i;
     int64_t t0;
-    int pages_sent = 0;
+    int done = 0;
 
     rcu_read_lock();
     if (ram_list.version != last_version) {
@@ -2007,9 +2007,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
         pages = ram_find_and_save_block(f, false, &bytes_transferred);
         /* no more pages to sent */
         if (pages == 0) {
+            done = 1;
             break;
         }
-        pages_sent += pages;
         acct_info.iterations++;
 
         /* we want to check in the 1st loop, just in case it was the 1st time
@@ -2044,7 +2044,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
         return ret;
     }
 
-    return pages_sent;
+    return done;
 }
 
 /* Called with iothread lock */
qemu_savevm_state_iterate() expects the iterators to return 1
when they are done, and 0 if there is still something left to do.
However, ram_save_iterate() does not obey this rule and returns
the number of saved pages instead. This causes a fatal hang with
ppc64 guests when you run QEMU like this (also works with TCG):

qemu-img create -f qcow2 /tmp/test.qcow2 1M
qemu-system-ppc64 -nographic -nodefaults -m 256 \
   -hda /tmp/test.qcow2 -serial mon:stdio

... then switch to the monitor by pressing CTRL-a c and try to
save a snapshot with "savevm test1" for example.

After the first iteration, ram_save_iterate() always returns 0 here,
so that qemu_savevm_state_iterate() hangs in an endless loop and you
can only "kill -9" the QEMU process.
Fix it by using proper return values in ram_save_iterate().

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 migration/ram.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)