Message ID | 1292317658-15798-1-git-send-email-tamura.yoshiaki@lab.ntt.co.jp |
---|---|
State | New |
Headers | show |
2010/12/14 Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>: > Although it's rare to happen in live migration, when the head of a > byte stream contains 0x05 which is the marker of subsection, the > loader gets corrupted because vmstate_subsection_load() continues even > the device doesn't require it. This patch adds a checker whether > subsection is needed, and skips following routines if not needed. > > Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> > --- > savevm.c | 8 +++++++- > 1 files changed, 7 insertions(+), 1 deletions(-) > > diff --git a/savevm.c b/savevm.c > index d38f79e..72f6249 100644 > --- a/savevm.c > +++ b/savevm.c > @@ -1633,6 +1633,12 @@ static const VMStateDescription *vmstate_get_subsection(const VMStateSubsection > static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd, > void *opaque) > { > + const VMStateSubsection *sub = vmsd->subsections; > + > + if (!sub || !sub->needed) { > + return 0; > + } > + > while (qemu_peek_byte(f) == QEMU_VM_SUBSECTION) { > char idstr[256]; > int ret; > @@ -1645,7 +1651,7 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd, > idstr[len] = 0; > version_id = qemu_get_be32(f); > > - sub_vmsd = vmstate_get_subsection(vmsd->subsections, idstr); > + sub_vmsd = vmstate_get_subsection(sub, idstr); > if (sub_vmsd == NULL) { > return -ENOENT; > } > -- > 1.7.1.2 > > > Hi Juan, This is an error that always happen with Kemari. Could you tell me if I'm fixing incorrectly? Thanks, Yoshi
On 12/14/2010 10:07 AM, Yoshiaki Tamura wrote: > Although it's rare to happen in live migration, when the head of a > byte stream contains 0x05 IIUC, this happens if you have VMS_STRUCT and the field after the VMS_STRUCT starts with 0x5. I think you should also add this in vmstate_subsection_load: sub_vmsd = vmstate_get_subsection(sub, idstr); if (sub_vmsd == NULL) { return -ENOENT; } + assert (!sub_vmsd->subsections); ret = vmstate_load_state(f, sub_vmsd, opaque, version_id); and this in vmstate_load_state: if (field->flags & VMS_STRUCT) { + assert (!vmsd->subsections); ret = vmstate_load_state(f, field->vmsd, addr, field->vmsd->version_id); } Paolo
2011/1/20 Paolo Bonzini <pbonzini@redhat.com>: > On 12/14/2010 10:07 AM, Yoshiaki Tamura wrote: >> >> Although it's rare to happen in live migration, when the head of a >> byte stream contains 0x05 > > IIUC, this happens if you have VMS_STRUCT and the field after the VMS_STRUCT > starts with 0x5. > > I think you should also add this in vmstate_subsection_load: > > sub_vmsd = vmstate_get_subsection(sub, idstr); > if (sub_vmsd == NULL) { > return -ENOENT; > } > + assert (!sub_vmsd->subsections); > ret = vmstate_load_state(f, sub_vmsd, opaque, version_id); > > and this in vmstate_load_state: > > if (field->flags & VMS_STRUCT) { > + assert (!vmsd->subsections); > ret = vmstate_load_state(f, field->vmsd, addr, > field->vmsd->version_id); > } Hi Paolo, You mean, having subsection nested and under VMS_STRUCT are violations? Yoshi > > Paolo > >
On 01/20/2011 09:57 AM, Yoshiaki Tamura wrote: > 2011/1/20 Paolo Bonzini<pbonzini@redhat.com>: >> On 12/14/2010 10:07 AM, Yoshiaki Tamura wrote: >>> >>> Although it's rare to happen in live migration, when the head of a >>> byte stream contains 0x05 >> >> IIUC, this happens if you have VMS_STRUCT and the field after the VMS_STRUCT >> starts with 0x5. >> >> I think you should also add this in vmstate_subsection_load: >> >> sub_vmsd = vmstate_get_subsection(sub, idstr); >> if (sub_vmsd == NULL) { >> return -ENOENT; >> } >> + assert (!sub_vmsd->subsections); >> ret = vmstate_load_state(f, sub_vmsd, opaque, version_id); >> >> and this in vmstate_load_state: >> >> if (field->flags& VMS_STRUCT) { >> + assert (!vmsd->subsections); >> ret = vmstate_load_state(f, field->vmsd, addr, >> field->vmsd->version_id); >> } > > Hi Paolo, > > You mean, having subsection nested and under VMS_STRUCT are > violations? I believe so, because the protocol doesn't allow you to distinguish: - in the case of nested subsections, whether 2 consecutive subsections are siblings, or the second is nested into the first. In fact, your patch also fixes a latent bug in case a device supports more than one subsection, and both are present in the data stream. When vmstate_subsection_load is called for the first subsection, it would see a 0x5 byte (the beginning of the second subsection), eat it and then fail with ENOENT. The second subsection would then fail to load. - in the case of VMS_STRUCT, whether a 0x5 byte after the VMS_STRUCT is a subsection or part of the parent data stream. This is, I believe, the source of your bug. I don't think it is possible to fix these problems in the file format while preserving backwards compatibility with pre-subsection QEMU (which was a fundamental requirement of subsections). So, I think your patch is correct and fixes the practical bugs. However, we can be even stronger and assert that the problematic vmstate descriptions are not used. Even better, asserts matching the ones above could be added to vmstate_subsection_save and vmstate_save_state, as well. Paolo
2011/1/20 Paolo Bonzini <pbonzini@redhat.com>: > On 01/20/2011 09:57 AM, Yoshiaki Tamura wrote: >> >> 2011/1/20 Paolo Bonzini<pbonzini@redhat.com>: >>> >>> On 12/14/2010 10:07 AM, Yoshiaki Tamura wrote: >>>> >>>> Although it's rare to happen in live migration, when the head of a >>>> byte stream contains 0x05 >>> >>> IIUC, this happens if you have VMS_STRUCT and the field after the >>> VMS_STRUCT >>> starts with 0x5. >>> >>> I think you should also add this in vmstate_subsection_load: >>> >>> sub_vmsd = vmstate_get_subsection(sub, idstr); >>> if (sub_vmsd == NULL) { >>> return -ENOENT; >>> } >>> + assert (!sub_vmsd->subsections); >>> ret = vmstate_load_state(f, sub_vmsd, opaque, version_id); >>> >>> and this in vmstate_load_state: >>> >>> if (field->flags& VMS_STRUCT) { >>> + assert (!vmsd->subsections); >>> ret = vmstate_load_state(f, field->vmsd, addr, >>> field->vmsd->version_id); >>> } >> >> Hi Paolo, >> >> You mean, having subsection nested and under VMS_STRUCT are >> violations? > > I believe so, because the protocol doesn't allow you to distinguish: > > - in the case of nested subsections, whether 2 consecutive subsections are > siblings, or the second is nested into the first. In fact, your patch also > fixes a latent bug in case a device supports more than one subsection, and > both are present in the data stream. When vmstate_subsection_load is called > for the first subsection, it would see a 0x5 byte (the beginning of the > second subsection), eat it and then fail with ENOENT. The second subsection > would then fail to load. > > - in the case of VMS_STRUCT, whether a 0x5 byte after the VMS_STRUCT is a > subsection or part of the parent data stream. This is, I believe, the > source of your bug. Thank you for the explanation. It's very helpful because I didn't know the background of subsection. Kemari is kind of stress test of live migration. > I don't think it is possible to fix these problems in the file format while > preserving backwards compatibility with pre-subsection QEMU (which was a > fundamental requirement of subsections). So, I think your patch is correct > and fixes the practical bugs. However, we can be even stronger and assert > that the problematic vmstate descriptions are not used. > > Even better, asserts matching the ones above could be added to > vmstate_subsection_save and vmstate_save_state, as well. I see. Let me fold the assertion you pointed to the original patch for now. Because I'm not an expert in subsection, I would like to leave further improvements to the others. Yoshi > > Paolo > >
2011/1/20 Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>: > 2011/1/20 Paolo Bonzini <pbonzini@redhat.com>: >> On 01/20/2011 09:57 AM, Yoshiaki Tamura wrote: >>> >>> 2011/1/20 Paolo Bonzini<pbonzini@redhat.com>: >>>> >>>> On 12/14/2010 10:07 AM, Yoshiaki Tamura wrote: >>>>> >>>>> Although it's rare to happen in live migration, when the head of a >>>>> byte stream contains 0x05 >>>> >>>> IIUC, this happens if you have VMS_STRUCT and the field after the >>>> VMS_STRUCT >>>> starts with 0x5. >>>> >>>> I think you should also add this in vmstate_subsection_load: >>>> >>>> sub_vmsd = vmstate_get_subsection(sub, idstr); >>>> if (sub_vmsd == NULL) { >>>> return -ENOENT; >>>> } >>>> + assert (!sub_vmsd->subsections); >>>> ret = vmstate_load_state(f, sub_vmsd, opaque, version_id); >>>> >>>> and this in vmstate_load_state: >>>> >>>> if (field->flags& VMS_STRUCT) { >>>> + assert (!vmsd->subsections); >>>> ret = vmstate_load_state(f, field->vmsd, addr, >>>> field->vmsd->version_id); >>>> } >>> >>> Hi Paolo, >>> >>> You mean, having subsection nested and under VMS_STRUCT are >>> violations? >> >> I believe so, because the protocol doesn't allow you to distinguish: >> >> - in the case of nested subsections, whether 2 consecutive subsections are >> siblings, or the second is nested into the first. In fact, your patch also >> fixes a latent bug in case a device supports more than one subsection, and >> both are present in the data stream. When vmstate_subsection_load is called >> for the first subsection, it would see a 0x5 byte (the beginning of the >> second subsection), eat it and then fail with ENOENT. The second subsection >> would then fail to load. >> >> - in the case of VMS_STRUCT, whether a 0x5 byte after the VMS_STRUCT is a >> subsection or part of the parent data stream. This is, I believe, the >> source of your bug. > > Thank you for the explanation. It's very helpful because I > didn't know the background of subsection. Kemari is kind of > stress test of live migration. > >> I don't think it is possible to fix these problems in the file format while >> preserving backwards compatibility with pre-subsection QEMU (which was a >> fundamental requirement of subsections). So, I think your patch is correct >> and fixes the practical bugs. However, we can be even stronger and assert >> that the problematic vmstate descriptions are not used. >> >> Even better, asserts matching the ones above could be added to >> vmstate_subsection_save and vmstate_save_state, as well. if (field->flags & VMS_STRUCT) { + assert (!vmsd->subsections); ret = vmstate_load_state(f, field->vmsd, addr, field->vmsd->version_id); } This assertion always fail for "cpu" in my environment. Yoshi > I see. Let me fold the assertion you pointed to the original > patch for now. Because I'm not an expert in subsection, I would > like to leave further improvements to the others. > > Yoshi > >> >> Paolo >> >> >
diff --git a/savevm.c b/savevm.c index d38f79e..72f6249 100644 --- a/savevm.c +++ b/savevm.c @@ -1633,6 +1633,12 @@ static const VMStateDescription *vmstate_get_subsection(const VMStateSubsection static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd, void *opaque) { + const VMStateSubsection *sub = vmsd->subsections; + + if (!sub || !sub->needed) { + return 0; + } + while (qemu_peek_byte(f) == QEMU_VM_SUBSECTION) { char idstr[256]; int ret; @@ -1645,7 +1651,7 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd, idstr[len] = 0; version_id = qemu_get_be32(f); - sub_vmsd = vmstate_get_subsection(vmsd->subsections, idstr); + sub_vmsd = vmstate_get_subsection(sub, idstr); if (sub_vmsd == NULL) { return -ENOENT; }
Although it's rare to happen in live migration, when the head of a byte stream contains 0x05 which is the marker of subsection, the loader gets corrupted because vmstate_subsection_load() continues even the device doesn't require it. This patch adds a checker whether subsection is needed, and skips following routines if not needed. Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> --- savevm.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-)