| Message ID | 20210511081350.419428-31-pbonzini@redhat.com |
|---|---|
| State | New |
| Series | [PULL,01/33] target/i386: Rename helper_fldt, helper_fstt |
* Paolo Bonzini (pbonzini@redhat.com) wrote:
> The HMP loadvm code is calling load_snapshot rather than
> qmp_snapshot_load, in order to bypass the job infrastructure. The code
> around it is almost the same, with one difference: hmp_loadvm is
> restarting the VM if load_snapshot fails, qmp_snapshot_load is doing so
> if load_snapshot succeeds.
>
> Fix the bug in QMP by moving the common code to load_snapshot.

See my comment:
https://lists.gnu.org/archive/html/qemu-devel/2021-05/msg01103.html

but you've also lost Eric's Rb.

Dave

> Cc: qemu-stable@nongnu.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  migration/savevm.c | 16 ++++++++--------
>  monitor/hmp-cmds.c |  7 +------
>  2 files changed, 9 insertions(+), 14 deletions(-)
>
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 52e2d72e4b..a899191cbf 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2992,6 +2992,7 @@ bool load_snapshot(const char *name, const char *vmstate,
>      int ret;
>      AioContext *aio_context;
>      MigrationIncomingState *mis = migration_incoming_get_current();
> +    int saved_vm_running = runstate_is_running();
>
>      if (!bdrv_all_can_snapshot(has_devices, devices, errp)) {
>          return false;
> @@ -3024,6 +3025,8 @@ bool load_snapshot(const char *name, const char *vmstate,
>          return false;
>      }
>
> +    vm_stop(RUN_STATE_RESTORE_VM);
> +
>      /*
>       * Flush the record/replay queue. Now the VM state is going
>       * to change. Therefore we don't need to preserve its consistency
> @@ -3061,13 +3064,17 @@ bool load_snapshot(const char *name, const char *vmstate,
>
>      if (ret < 0) {
>          error_setg(errp, "Error %d while loading VM state", ret);
> -        return false;
> +        goto err_restart;
>      }
>
>      return true;
>
>  err_drain:
>      bdrv_drain_all_end();
> +err_restart:
> +    if (saved_vm_running) {
> +        vm_start();
> +    }
>      return false;
>  }
>
> @@ -3135,17 +3142,10 @@ static void snapshot_load_job_bh(void *opaque)
>  {
>      Job *job = opaque;
>      SnapshotJob *s = container_of(job, SnapshotJob, common);
> -    int orig_vm_running;
>
>      job_progress_set_remaining(&s->common, 1);
>
> -    orig_vm_running = runstate_is_running();
> -    vm_stop(RUN_STATE_RESTORE_VM);
> -
>      s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
> -    if (s->ret && orig_vm_running) {
> -        vm_start();
> -    }
>
>      job_progress_update(&s->common, 1);
>
> diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
> index 0ad5b77477..a39436c8cb 100644
> --- a/monitor/hmp-cmds.c
> +++ b/monitor/hmp-cmds.c
> @@ -1127,15 +1127,10 @@ void hmp_balloon(Monitor *mon, const QDict *qdict)
>
>  void hmp_loadvm(Monitor *mon, const QDict *qdict)
>  {
> -    int saved_vm_running = runstate_is_running();
>      const char *name = qdict_get_str(qdict, "name");
>      Error *err = NULL;
>
> -    vm_stop(RUN_STATE_RESTORE_VM);
> -
> -    if (!load_snapshot(name, NULL, false, NULL, &err) && saved_vm_running) {
> -        vm_start();
> -    }
> +    load_snapshot(name, NULL, false, NULL, &err);
>      hmp_handle_error(mon, err);
>  }
>
> --
> 2.26.2
On Tue, May 11, 2021 at 04:13:47AM -0400, Paolo Bonzini wrote:
> The HMP loadvm code is calling load_snapshot rather than
> qmp_snapshot_load, in order to bypass the job infrastructure. The code
> around it is almost the same, with one difference: hmp_loadvm is
> restarting the VM if load_snapshot fails, qmp_snapshot_load is doing so
> if load_snapshot succeeds.
>
> Fix the bug in QMP by moving the common code to load_snapshot.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  migration/savevm.c | 16 ++++++++--------
>  monitor/hmp-cmds.c |  7 +------
>  2 files changed, 9 insertions(+), 14 deletions(-)

David had a question about safety on this patch that probably ought to
be answered before merging:

https://lists.gnu.org/archive/html/qemu-devel/2021-05/msg01103.html

Regards,
Daniel
On Tue, May 11, 2021 at 04:13:47AM -0400, Paolo Bonzini wrote:
> The HMP loadvm code is calling load_snapshot rather than
> qmp_snapshot_load, in order to bypass the job infrastructure. The code
> around it is almost the same, with one difference: hmp_loadvm is
> restarting the VM if load_snapshot fails, qmp_snapshot_load is doing so
> if load_snapshot succeeds.
>
> Fix the bug in QMP by moving the common code to load_snapshot.

This doesn't appear to have actually fixed the HMP regression.
Instead I think it has duplicated the HMP bug in the QMP code too.

See the 068 iotest enhancement here that validates the expected
state on success:

https://lists.gnu.org/archive/html/qemu-devel/2021-05/msg03091.html

Regards,
Daniel
On 11.05.2021 at 19:06, Daniel P. Berrangé wrote:
> On Tue, May 11, 2021 at 04:13:47AM -0400, Paolo Bonzini wrote:
> > The HMP loadvm code is calling load_snapshot rather than
> > qmp_snapshot_load, in order to bypass the job infrastructure. The code
> > around it is almost the same, with one difference: hmp_loadvm is
> > restarting the VM if load_snapshot fails, qmp_snapshot_load is doing so
> > if load_snapshot succeeds.
> >
> > Fix the bug in QMP by moving the common code to load_snapshot.
>
> This doesn't appear to have actually fixed the HMP regression.
> Instead I think it has duplicated the HMP bug in the QMP code too.
>
> See the 068 iotest enhancement here that validates the expected
> state on success:
>
> https://lists.gnu.org/archive/html/qemu-devel/2021-05/msg03091.html

I guess I'll wait for this new QMP bug to hit master, and then rebase my
fix on top of it, with now two Fixes: lines.

Or do you want to send a v2 of this pull request without this patch,
Paolo?

Kevin
On 11/05/21 10:56, Dr. David Alan Gilbert wrote:
> * Paolo Bonzini (pbonzini@redhat.com) wrote:
>> The HMP loadvm code is calling load_snapshot rather than
>> qmp_snapshot_load, in order to bypass the job infrastructure. The code
>> around it is almost the same, with one difference: hmp_loadvm is
>> restarting the VM if load_snapshot fails, qmp_snapshot_load is doing so
>> if load_snapshot succeeds.
>>
>> Fix the bug in QMP by moving the common code to load_snapshot.
>
> See my comment:
> https://lists.gnu.org/archive/html/qemu-devel/2021-05/msg01103.html
>
> but you've also lost Eric's Rb.

Sorry, I missed both replies. As Daniel pointed out, the bug is in the
QMP version. Kevin has the correct fix; I'll send the cleanup of
vm_stop/vm_start separately on top of his patch.

Paolo

> Dave
>
>> Cc: qemu-stable@nongnu.org
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  migration/savevm.c | 16 ++++++++--------
>>  monitor/hmp-cmds.c |  7 +------
>>  2 files changed, 9 insertions(+), 14 deletions(-)
>>
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 52e2d72e4b..a899191cbf 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -2992,6 +2992,7 @@ bool load_snapshot(const char *name, const char *vmstate,
>>      int ret;
>>      AioContext *aio_context;
>>      MigrationIncomingState *mis = migration_incoming_get_current();
>> +    int saved_vm_running = runstate_is_running();
>>
>>      if (!bdrv_all_can_snapshot(has_devices, devices, errp)) {
>>          return false;
>> @@ -3024,6 +3025,8 @@ bool load_snapshot(const char *name, const char *vmstate,
>>          return false;
>>      }
>>
>> +    vm_stop(RUN_STATE_RESTORE_VM);
>> +
>>      /*
>>       * Flush the record/replay queue. Now the VM state is going
>>       * to change. Therefore we don't need to preserve its consistency
>> @@ -3061,13 +3064,17 @@ bool load_snapshot(const char *name, const char *vmstate,
>>
>>      if (ret < 0) {
>>          error_setg(errp, "Error %d while loading VM state", ret);
>> -        return false;
>> +        goto err_restart;
>>      }
>>
>>      return true;
>>
>>  err_drain:
>>      bdrv_drain_all_end();
>> +err_restart:
>> +    if (saved_vm_running) {
>> +        vm_start();
>> +    }
>>      return false;
>>  }
>>
>> @@ -3135,17 +3142,10 @@ static void snapshot_load_job_bh(void *opaque)
>>  {
>>      Job *job = opaque;
>>      SnapshotJob *s = container_of(job, SnapshotJob, common);
>> -    int orig_vm_running;
>>
>>      job_progress_set_remaining(&s->common, 1);
>>
>> -    orig_vm_running = runstate_is_running();
>> -    vm_stop(RUN_STATE_RESTORE_VM);
>> -
>>      s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
>> -    if (s->ret && orig_vm_running) {
>> -        vm_start();
>> -    }
>>
>>      job_progress_update(&s->common, 1);
>>
>> diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
>> index 0ad5b77477..a39436c8cb 100644
>> --- a/monitor/hmp-cmds.c
>> +++ b/monitor/hmp-cmds.c
>> @@ -1127,15 +1127,10 @@ void hmp_balloon(Monitor *mon, const QDict *qdict)
>>
>>  void hmp_loadvm(Monitor *mon, const QDict *qdict)
>>  {
>> -    int saved_vm_running = runstate_is_running();
>>      const char *name = qdict_get_str(qdict, "name");
>>      Error *err = NULL;
>>
>> -    vm_stop(RUN_STATE_RESTORE_VM);
>> -
>> -    if (!load_snapshot(name, NULL, false, NULL, &err) && saved_vm_running) {
>> -        vm_start();
>> -    }
>> +    load_snapshot(name, NULL, false, NULL, &err);
>>      hmp_handle_error(mon, err);
>>  }
>>
>> --
>> 2.26.2
On 12/05/21 09:45, Kevin Wolf wrote:
> I guess I'll wait for this new QMP bug to hit master, and then rebase my
> fix on top of it, with now two Fixes: lines.
>
> Or do you want to send a v2 of this pull request without this patch,
> Paolo?

I managed to remove it from the tag before Peter pulled, so problem
averted.

Paolo
diff --git a/migration/savevm.c b/migration/savevm.c
index 52e2d72e4b..a899191cbf 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2992,6 +2992,7 @@ bool load_snapshot(const char *name, const char *vmstate,
     int ret;
     AioContext *aio_context;
     MigrationIncomingState *mis = migration_incoming_get_current();
+    int saved_vm_running = runstate_is_running();

     if (!bdrv_all_can_snapshot(has_devices, devices, errp)) {
         return false;
@@ -3024,6 +3025,8 @@ bool load_snapshot(const char *name, const char *vmstate,
         return false;
     }

+    vm_stop(RUN_STATE_RESTORE_VM);
+
     /*
      * Flush the record/replay queue. Now the VM state is going
      * to change. Therefore we don't need to preserve its consistency
@@ -3061,13 +3064,17 @@ bool load_snapshot(const char *name, const char *vmstate,

     if (ret < 0) {
         error_setg(errp, "Error %d while loading VM state", ret);
-        return false;
+        goto err_restart;
     }

     return true;

 err_drain:
     bdrv_drain_all_end();
+err_restart:
+    if (saved_vm_running) {
+        vm_start();
+    }
     return false;
 }

@@ -3135,17 +3142,10 @@ static void snapshot_load_job_bh(void *opaque)
 {
     Job *job = opaque;
     SnapshotJob *s = container_of(job, SnapshotJob, common);
-    int orig_vm_running;

     job_progress_set_remaining(&s->common, 1);

-    orig_vm_running = runstate_is_running();
-    vm_stop(RUN_STATE_RESTORE_VM);
-
     s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
-    if (s->ret && orig_vm_running) {
-        vm_start();
-    }

     job_progress_update(&s->common, 1);

diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 0ad5b77477..a39436c8cb 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1127,15 +1127,10 @@ void hmp_balloon(Monitor *mon, const QDict *qdict)

 void hmp_loadvm(Monitor *mon, const QDict *qdict)
 {
-    int saved_vm_running = runstate_is_running();
     const char *name = qdict_get_str(qdict, "name");
     Error *err = NULL;

-    vm_stop(RUN_STATE_RESTORE_VM);
-
-    if (!load_snapshot(name, NULL, false, NULL, &err) && saved_vm_running) {
-        vm_start();
-    }
+    load_snapshot(name, NULL, false, NULL, &err);
     hmp_handle_error(mon, err);
 }
The HMP loadvm code is calling load_snapshot rather than
qmp_snapshot_load, in order to bypass the job infrastructure. The code
around it is almost the same, with one difference: hmp_loadvm is
restarting the VM if load_snapshot fails, qmp_snapshot_load is doing so
if load_snapshot succeeds.

Fix the bug in QMP by moving the common code to load_snapshot.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 migration/savevm.c | 16 ++++++++--------
 monitor/hmp-cmds.c |  7 +------
 2 files changed, 9 insertions(+), 14 deletions(-)