Message ID | 20200327152930.66636-1-david@redhat.com
---|---
State | New
Series | [v1] s390x: Reject unaligned RAM sizes
On 27.03.20 16:29, David Hildenbrand wrote: > Historically, we fixed up the RAM size (rounded it down), to fit into > storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use > memdev for RAM"), we no longer consider the fixed-up size when > allcoating the RAM block - which will break migration. > > Let's simply drop that manual fixup code and let the user supply sane > RAM sizes. This will bail out early when trying to migrate (and make > an existing guest with e.g., 12345 MB non-migratable), but maybe we > should have rejected such RAM sizes right from the beginning. > > As we no longer fixup maxram_size as well, make other users use ram_size > instead. Keep using maxram_size when setting the maximum ram size in KVM, > as that will come in handy in the future when supporting memory hotplug > (in contrast, storage keys and storage attributes for hotplugged memory > will have to be migrated per RAM block in the future). > > This fixes (or rather rejects early): > > 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the > RAM block size changed. Not sure I like this variant. Instead of breaking migration (that was accidentally done by Igor's changes) we now reject migration from older QEMUs to 5.0. This is not going to help those that still have such guests running and want to migrate. > > 2. Migrating upstream QEMU to upstream QEMU (e.g., with "-m 1235M"), as > we receive storage attributes for memory we don't expect (as we fixed up > ram_size and maxram_size). > > Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM") > Reported-by: Lukáš Doktor <ldoktor@redhat.com> > Cc: Igor Mammedov <imammedo@redhat.com> > Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> > Signed-off-by: David Hildenbrand <david@redhat.com>
On 27.03.20 16:34, Christian Borntraeger wrote: > > > On 27.03.20 16:29, David Hildenbrand wrote: >> Historically, we fixed up the RAM size (rounded it down), to fit into >> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use >> memdev for RAM"), we no longer consider the fixed-up size when >> allcoating the RAM block - which will break migration. >> >> Let's simply drop that manual fixup code and let the user supply sane >> RAM sizes. This will bail out early when trying to migrate (and make >> an existing guest with e.g., 12345 MB non-migratable), but maybe we >> should have rejected such RAM sizes right from the beginning. >> >> As we no longer fixup maxram_size as well, make other users use ram_size >> instead. Keep using maxram_size when setting the maximum ram size in KVM, >> as that will come in handy in the future when supporting memory hotplug >> (in contrast, storage keys and storage attributes for hotplugged memory >> will have to be migrated per RAM block in the future). >> >> This fixes (or rather rejects early): >> >> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the >> RAM block size changed. > > Not sure I like this variant. Instead of breaking migration (that was > accidentially done by Igors changes) we now reject migration from older > QEMUs to 5.0. This is not going to help those that still have such guests > running and want to migrate. As Igor mentioned on another channel, you most probably can migrate an older guest by starting it on the target with a fixed-up size. E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M" Not sure how many such weird-size VMs we actually do have in practice.
On 27.03.20 17:01, David Hildenbrand wrote: > On 27.03.20 16:34, Christian Borntraeger wrote: >> >> >> On 27.03.20 16:29, David Hildenbrand wrote: >>> Historically, we fixed up the RAM size (rounded it down), to fit into >>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use >>> memdev for RAM"), we no longer consider the fixed-up size when >>> allcoating the RAM block - which will break migration. >>> >>> Let's simply drop that manual fixup code and let the user supply sane >>> RAM sizes. This will bail out early when trying to migrate (and make >>> an existing guest with e.g., 12345 MB non-migratable), but maybe we >>> should have rejected such RAM sizes right from the beginning. >>> >>> As we no longer fixup maxram_size as well, make other users use ram_size >>> instead. Keep using maxram_size when setting the maximum ram size in KVM, >>> as that will come in handy in the future when supporting memory hotplug >>> (in contrast, storage keys and storage attributes for hotplugged memory >>> will have to be migrated per RAM block in the future). >>> >>> This fixes (or rather rejects early): >>> >>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the >>> RAM block size changed. >> >> Not sure I like this variant. Instead of breaking migration (that was >> accidentially done by Igors changes) we now reject migration from older >> QEMUs to 5.0. This is not going to help those that still have such guests >> running and want to migrate. > > As Igor mentioned on another channel, you most probably can migrate an > older guest by starting it on the target with a fixed-up size. > > E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M" Yes, that should probably work. > > Not sure how many such weird-size VMs we actually do have in practice. I am worried about some automated deployments where tooling has created these sizes for dozens or hundreds of containers in VMS and so.
On Fri, 27 Mar 2020 17:05:34 +0100 Christian Borntraeger <borntraeger@de.ibm.com> wrote: > On 27.03.20 17:01, David Hildenbrand wrote: > > On 27.03.20 16:34, Christian Borntraeger wrote: > >> > >> > >> On 27.03.20 16:29, David Hildenbrand wrote: > >>> Historically, we fixed up the RAM size (rounded it down), to fit into > >>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use > >>> memdev for RAM"), we no longer consider the fixed-up size when > >>> allcoating the RAM block - which will break migration. > >>> > >>> Let's simply drop that manual fixup code and let the user supply sane > >>> RAM sizes. This will bail out early when trying to migrate (and make > >>> an existing guest with e.g., 12345 MB non-migratable), but maybe we > >>> should have rejected such RAM sizes right from the beginning. > >>> > >>> As we no longer fixup maxram_size as well, make other users use ram_size > >>> instead. Keep using maxram_size when setting the maximum ram size in KVM, > >>> as that will come in handy in the future when supporting memory hotplug > >>> (in contrast, storage keys and storage attributes for hotplugged memory > >>> will have to be migrated per RAM block in the future). > >>> > >>> This fixes (or rather rejects early): > >>> > >>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the > >>> RAM block size changed. > >> > >> Not sure I like this variant. Instead of breaking migration (that was > >> accidentially done by Igors changes) we now reject migration from older > >> QEMUs to 5.0. This is not going to help those that still have such guests > >> running and want to migrate. > > > > As Igor mentioned on another channel, you most probably can migrate an > > older guest by starting it on the target with a fixed-up size. > > > > E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M" > > Yes, that should probably work. I'm in process of testing it. 
> > Not sure how many such weird-size VMs we actually do have in practice. > > I am worried about some automated deployments where tooling has created > these sizes for dozens or hundreds of containers in VMS and so. Yep, it's possible but then that tooling/configs should be fixed to work with new QEMU that validates user's input.
On Fri, 27 Mar 2020 16:29:30 +0100 David Hildenbrand <david@redhat.com> wrote: > Historically, we fixed up the RAM size (rounded it down), to fit into > storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use > memdev for RAM"), we no longer consider the fixed-up size when > allcoating the RAM block - which will break migration. > > Let's simply drop that manual fixup code and let the user supply sane > RAM sizes. This will bail out early when trying to migrate (and make > an existing guest with e.g., 12345 MB non-migratable), but maybe we > should have rejected such RAM sizes right from the beginning. > > As we no longer fixup maxram_size as well, make other users use ram_size > instead. Keep using maxram_size when setting the maximum ram size in KVM, > as that will come in handy in the future when supporting memory hotplug > (in contrast, storage keys and storage attributes for hotplugged memory > will have to be migrated per RAM block in the future). > > This fixes (or rather rejects early): > > 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the > RAM block size changed. > > 2. Migrating upstream QEMU to upstream QEMU (e.g., with "-m 1235M"), as > we receive storage attributes for memory we don't expect (as we fixed up > ram_size and maxram_size). > > Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM") > Reported-by: Lukáš Doktor <ldoktor@redhat.com> > Cc: Igor Mammedov <imammedo@redhat.com> > Cc: Dr. 
David Alan Gilbert <dgilbert@redhat.com> > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > hw/s390x/s390-skeys.c | 4 +--- > hw/s390x/s390-stattrib-kvm.c | 7 ++----- > hw/s390x/sclp.c | 21 +++++++++++---------- > 3 files changed, 14 insertions(+), 18 deletions(-) > > diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c > index 5da6e5292f..2545b1576b 100644 > --- a/hw/s390x/s390-skeys.c > +++ b/hw/s390x/s390-skeys.c > @@ -11,7 +11,6 @@ > > #include "qemu/osdep.h" > #include "qemu/units.h" > -#include "hw/boards.h" > #include "hw/s390x/storage-keys.h" > #include "qapi/error.h" > #include "qapi/qapi-commands-misc-target.h" > @@ -174,9 +173,8 @@ out: > static void qemu_s390_skeys_init(Object *obj) > { > QEMUS390SKeysState *skeys = QEMU_S390_SKEYS(obj); > - MachineState *machine = MACHINE(qdev_get_machine()); > > - skeys->key_count = machine->maxram_size / TARGET_PAGE_SIZE; > + skeys->key_count = ram_size / TARGET_PAGE_SIZE; why are you dropping machine->foo all around and switching to global ram_size? 
(I'd rather do other way around) > skeys->keydata = g_malloc0(skeys->key_count); > } > > diff --git a/hw/s390x/s390-stattrib-kvm.c b/hw/s390x/s390-stattrib-kvm.c > index c7e1f35524..ae88fbc32e 100644 > --- a/hw/s390x/s390-stattrib-kvm.c > +++ b/hw/s390x/s390-stattrib-kvm.c > @@ -10,7 +10,6 @@ > */ > > #include "qemu/osdep.h" > -#include "hw/boards.h" > #include "migration/qemu-file.h" > #include "hw/s390x/storage-attributes.h" > #include "qemu/error-report.h" > @@ -84,8 +83,7 @@ static int kvm_s390_stattrib_set_stattr(S390StAttribState *sa, > uint8_t *values) > { > KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa); > - MachineState *machine = MACHINE(qdev_get_machine()); > - unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE; > + unsigned long max = ram_size / TARGET_PAGE_SIZE; > > if (start_gfn + count > max) { > error_report("Out of memory bounds when setting storage attributes"); > @@ -103,8 +101,7 @@ static int kvm_s390_stattrib_set_stattr(S390StAttribState *sa, > static void kvm_s390_stattrib_synchronize(S390StAttribState *sa) > { > KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa); > - MachineState *machine = MACHINE(qdev_get_machine()); > - unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE; > + unsigned long max = ram_size / TARGET_PAGE_SIZE; > /* We do not need to reach the maximum buffer size allowed */ > unsigned long cx, len = KVM_S390_SKEYS_MAX / 2; > int r; > diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c > index af0bfbc2ec..6af471fb3f 100644 > --- a/hw/s390x/sclp.c > +++ b/hw/s390x/sclp.c > @@ -326,8 +326,7 @@ out: > > static void sclp_memory_init(SCLPDevice *sclp) > { > - MachineState *machine = MACHINE(qdev_get_machine()); > - ram_addr_t initial_mem = machine->ram_size; > + uint64_t initial_mem = ram_size; > int increment_size = 20; > > /* The storage increment size is a multiple of 1M and is a power of 2. 
> @@ -339,15 +338,17 @@ static void sclp_memory_init(SCLPDevice *sclp) > } > sclp->increment_size = increment_size; > > - /* The core memory area needs to be aligned with the increment size. > - * In effect, this can cause the user-specified memory size to be rounded > - * down to align with the nearest increment boundary. */ > + /* > + * The core memory area needs to be aligned to the increment size. In > + * case it's not aligned, bail out. > + */ > initial_mem = initial_mem >> increment_size << increment_size; > - > - machine->ram_size = initial_mem; > - machine->maxram_size = initial_mem; > - /* let's propagate the changed ram size into the global variable. */ > - ram_size = initial_mem; > + if (initial_mem != ram_size) { > + error_report("RAM size not aligned to storage increments." > + " Possible aligned RAM size: %" PRIu64 " MB", > + initial_mem / MiB); > + exit(1); > + } > } > > static void sclp_init(Object *obj)
On 27.03.20 17:48, Igor Mammedov wrote: > On Fri, 27 Mar 2020 16:29:30 +0100 > David Hildenbrand <david@redhat.com> wrote: > >> Historically, we fixed up the RAM size (rounded it down), to fit into >> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use >> memdev for RAM"), we no longer consider the fixed-up size when >> allcoating the RAM block - which will break migration. >> >> Let's simply drop that manual fixup code and let the user supply sane >> RAM sizes. This will bail out early when trying to migrate (and make >> an existing guest with e.g., 12345 MB non-migratable), but maybe we >> should have rejected such RAM sizes right from the beginning. >> >> As we no longer fixup maxram_size as well, make other users use ram_size >> instead. Keep using maxram_size when setting the maximum ram size in KVM, >> as that will come in handy in the future when supporting memory hotplug >> (in contrast, storage keys and storage attributes for hotplugged memory >> will have to be migrated per RAM block in the future). >> >> This fixes (or rather rejects early): >> >> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the >> RAM block size changed. >> >> 2. Migrating upstream QEMU to upstream QEMU (e.g., with "-m 1235M"), as >> we receive storage attributes for memory we don't expect (as we fixed up >> ram_size and maxram_size). >> >> Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM") >> Reported-by: Lukáš Doktor <ldoktor@redhat.com> >> Cc: Igor Mammedov <imammedo@redhat.com> >> Cc: Dr. 
David Alan Gilbert <dgilbert@redhat.com> >> Signed-off-by: David Hildenbrand <david@redhat.com> >> --- >> hw/s390x/s390-skeys.c | 4 +--- >> hw/s390x/s390-stattrib-kvm.c | 7 ++----- >> hw/s390x/sclp.c | 21 +++++++++++---------- >> 3 files changed, 14 insertions(+), 18 deletions(-) >> >> diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c >> index 5da6e5292f..2545b1576b 100644 >> --- a/hw/s390x/s390-skeys.c >> +++ b/hw/s390x/s390-skeys.c >> @@ -11,7 +11,6 @@ >> >> #include "qemu/osdep.h" >> #include "qemu/units.h" >> -#include "hw/boards.h" >> #include "hw/s390x/storage-keys.h" >> #include "qapi/error.h" >> #include "qapi/qapi-commands-misc-target.h" >> @@ -174,9 +173,8 @@ out: >> static void qemu_s390_skeys_init(Object *obj) >> { >> QEMUS390SKeysState *skeys = QEMU_S390_SKEYS(obj); >> - MachineState *machine = MACHINE(qdev_get_machine()); >> >> - skeys->key_count = machine->maxram_size / TARGET_PAGE_SIZE; >> + skeys->key_count = ram_size / TARGET_PAGE_SIZE; > > why are you dropping machine->foo all around and switching to global ram_size? > (I'd rather do other way around) Not sure what the latest best practice is. I can also simply convert to machine->ram_size if that's the right thing to do.
On 27.03.20 17:46, Igor Mammedov wrote: > On Fri, 27 Mar 2020 17:05:34 +0100 > Christian Borntraeger <borntraeger@de.ibm.com> wrote: > >> On 27.03.20 17:01, David Hildenbrand wrote: >>> On 27.03.20 16:34, Christian Borntraeger wrote: >>>> >>>> >>>> On 27.03.20 16:29, David Hildenbrand wrote: >>>>> Historically, we fixed up the RAM size (rounded it down), to fit into >>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use >>>>> memdev for RAM"), we no longer consider the fixed-up size when >>>>> allcoating the RAM block - which will break migration. >>>>> >>>>> Let's simply drop that manual fixup code and let the user supply sane >>>>> RAM sizes. This will bail out early when trying to migrate (and make >>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we >>>>> should have rejected such RAM sizes right from the beginning. >>>>> >>>>> As we no longer fixup maxram_size as well, make other users use ram_size >>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM, >>>>> as that will come in handy in the future when supporting memory hotplug >>>>> (in contrast, storage keys and storage attributes for hotplugged memory >>>>> will have to be migrated per RAM block in the future). >>>>> >>>>> This fixes (or rather rejects early): >>>>> >>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the >>>>> RAM block size changed. >>>> >>>> Not sure I like this variant. Instead of breaking migration (that was >>>> accidentially done by Igors changes) we now reject migration from older >>>> QEMUs to 5.0. This is not going to help those that still have such guests >>>> running and want to migrate. >>> >>> As Igor mentioned on another channel, you most probably can migrate an >>> older guest by starting it on the target with a fixed-up size. >>> >>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M" >> >> Yes, that should probably work. > I'm in process of testing it. 
> >>> Not sure how many such weird-size VMs we actually do have in practice. >> >> I am worried about some automated deployments where tooling has created >> these sizes for dozens or hundreds of containers in VMS and so. IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be surprised if it's not multiples of, say, 128MB. > Yep, it's possible but then that tooling/configs should be fixed to work with > new QEMU that validates user's input. > Yeah, and mention it in the cover letter, plus possibly a "fixup" table (e.g., old_size < X, has to be aligned to Y). One alternative is to have an early fixup hack in QEMU that fixes up the sizes as we did before (and possibly warns the user). Not sure if we really want/need that.
On Fri, 27 Mar 2020 17:46:20 +0100 Igor Mammedov <imammedo@redhat.com> wrote: > On Fri, 27 Mar 2020 17:05:34 +0100 > Christian Borntraeger <borntraeger@de.ibm.com> wrote: > > > On 27.03.20 17:01, David Hildenbrand wrote: > > > On 27.03.20 16:34, Christian Borntraeger wrote: > > >> > > >> > > >> On 27.03.20 16:29, David Hildenbrand wrote: > > >>> Historically, we fixed up the RAM size (rounded it down), to fit into > > >>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use > > >>> memdev for RAM"), we no longer consider the fixed-up size when > > >>> allcoating the RAM block - which will break migration. > > >>> > > >>> Let's simply drop that manual fixup code and let the user supply sane > > >>> RAM sizes. This will bail out early when trying to migrate (and make > > >>> an existing guest with e.g., 12345 MB non-migratable), but maybe we > > >>> should have rejected such RAM sizes right from the beginning. > > >>> > > >>> As we no longer fixup maxram_size as well, make other users use ram_size > > >>> instead. Keep using maxram_size when setting the maximum ram size in KVM, > > >>> as that will come in handy in the future when supporting memory hotplug > > >>> (in contrast, storage keys and storage attributes for hotplugged memory > > >>> will have to be migrated per RAM block in the future). > > >>> > > >>> This fixes (or rather rejects early): > > >>> > > >>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the > > >>> RAM block size changed. > > >> > > >> Not sure I like this variant. Instead of breaking migration (that was > > >> accidentially done by Igors changes) we now reject migration from older > > >> QEMUs to 5.0. This is not going to help those that still have such guests > > >> running and want to migrate. > > > > > > As Igor mentioned on another channel, you most probably can migrate an > > > older guest by starting it on the target with a fixed-up size. 
> > > > > > E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M" > > > > Yes, that should probably work. > I'm in process of testing it. > > > > Not sure how many such weird-size VMs we actually do have in practice. > > > > I am worried about some automated deployments where tooling has created > > these sizes for dozens or hundreds of containers in VMS and so. > Yep, it's possible but then that tooling/configs should be fixed to work with > new QEMU that validates user's input. > @David: I'm a little confused. Is this fix about adding user input validation, or is it about changing what valid inputs are? I don't see this alignment requirement documented, so my guess is the latter. And then, I'm not sure I'm sold on this. Regards, Halil
On 27.03.20 19:16, Halil Pasic wrote: > On Fri, 27 Mar 2020 17:46:20 +0100 > Igor Mammedov <imammedo@redhat.com> wrote: > >> On Fri, 27 Mar 2020 17:05:34 +0100 >> Christian Borntraeger <borntraeger@de.ibm.com> wrote: >> >>> On 27.03.20 17:01, David Hildenbrand wrote: >>>> On 27.03.20 16:34, Christian Borntraeger wrote: >>>>> >>>>> >>>>> On 27.03.20 16:29, David Hildenbrand wrote: >>>>>> Historically, we fixed up the RAM size (rounded it down), to fit into >>>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use >>>>>> memdev for RAM"), we no longer consider the fixed-up size when >>>>>> allcoating the RAM block - which will break migration. >>>>>> >>>>>> Let's simply drop that manual fixup code and let the user supply sane >>>>>> RAM sizes. This will bail out early when trying to migrate (and make >>>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we >>>>>> should have rejected such RAM sizes right from the beginning. >>>>>> >>>>>> As we no longer fixup maxram_size as well, make other users use ram_size >>>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM, >>>>>> as that will come in handy in the future when supporting memory hotplug >>>>>> (in contrast, storage keys and storage attributes for hotplugged memory >>>>>> will have to be migrated per RAM block in the future). >>>>>> >>>>>> This fixes (or rather rejects early): >>>>>> >>>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the >>>>>> RAM block size changed. >>>>> >>>>> Not sure I like this variant. Instead of breaking migration (that was >>>>> accidentially done by Igors changes) we now reject migration from older >>>>> QEMUs to 5.0. This is not going to help those that still have such guests >>>>> running and want to migrate. >>>> >>>> As Igor mentioned on another channel, you most probably can migrate an >>>> older guest by starting it on the target with a fixed-up size. 
>>>> >>>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M" >>> >>> Yes, that should probably work. >> I'm in process of testing it. >> >>>> Not sure how many such weird-size VMs we actually do have in practice. >>> >>> I am worried about some automated deployments where tooling has created >>> these sizes for dozens or hundreds of containers in VMS and so. >> Yep, it's possible but then that tooling/configs should be fixed to work with >> new QEMU that validates user's input. >> > > @David: I'm a little confused. Is this fix about adding user input > validation, or is it about changing what valid inputs are? We used to silently fixup the user input instead of bailing out. So it's rather the latter. It was never the right choice to silently fixup instead of just bailing out. We never should have done that. > > I don't see this alignment requirement documented, so my guess is the > latter. And then, I'm not sure I'm sold on this. Well, we can add documentation with the next release. It's something to end up in the release notes. If somebody specified "-m 1235", he always received "1234 MB" and never "1235 MB". You could also call that a BUG, which should have bailed out right from the start (there wasn't even a warning).
On Fri, 27 Mar 2020 17:51:23 +0100 David Hildenbrand <david@redhat.com> wrote: > On 27.03.20 17:48, Igor Mammedov wrote: > > On Fri, 27 Mar 2020 16:29:30 +0100 > > David Hildenbrand <david@redhat.com> wrote: > > > >> Historically, we fixed up the RAM size (rounded it down), to fit into > >> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use > >> memdev for RAM"), we no longer consider the fixed-up size when > >> allcoating the RAM block - which will break migration. > >> > >> Let's simply drop that manual fixup code and let the user supply sane > >> RAM sizes. This will bail out early when trying to migrate (and make > >> an existing guest with e.g., 12345 MB non-migratable), but maybe we > >> should have rejected such RAM sizes right from the beginning. > >> > >> As we no longer fixup maxram_size as well, make other users use ram_size > >> instead. Keep using maxram_size when setting the maximum ram size in KVM, > >> as that will come in handy in the future when supporting memory hotplug > >> (in contrast, storage keys and storage attributes for hotplugged memory > >> will have to be migrated per RAM block in the future). > >> > >> This fixes (or rather rejects early): > >> > >> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the > >> RAM block size changed. > >> > >> 2. Migrating upstream QEMU to upstream QEMU (e.g., with "-m 1235M"), as > >> we receive storage attributes for memory we don't expect (as we fixed up > >> ram_size and maxram_size). > >> > >> Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM") > >> Reported-by: Lukáš Doktor <ldoktor@redhat.com> > >> Cc: Igor Mammedov <imammedo@redhat.com> > >> Cc: Dr. 
David Alan Gilbert <dgilbert@redhat.com> > >> Signed-off-by: David Hildenbrand <david@redhat.com> > >> --- > >> hw/s390x/s390-skeys.c | 4 +--- > >> hw/s390x/s390-stattrib-kvm.c | 7 ++----- > >> hw/s390x/sclp.c | 21 +++++++++++---------- > >> 3 files changed, 14 insertions(+), 18 deletions(-) > >> > >> diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c > >> index 5da6e5292f..2545b1576b 100644 > >> --- a/hw/s390x/s390-skeys.c > >> +++ b/hw/s390x/s390-skeys.c > >> @@ -11,7 +11,6 @@ > >> > >> #include "qemu/osdep.h" > >> #include "qemu/units.h" > >> -#include "hw/boards.h" > >> #include "hw/s390x/storage-keys.h" > >> #include "qapi/error.h" > >> #include "qapi/qapi-commands-misc-target.h" > >> @@ -174,9 +173,8 @@ out: > >> static void qemu_s390_skeys_init(Object *obj) > >> { > >> QEMUS390SKeysState *skeys = QEMU_S390_SKEYS(obj); > >> - MachineState *machine = MACHINE(qdev_get_machine()); > >> > >> - skeys->key_count = machine->maxram_size / TARGET_PAGE_SIZE; > >> + skeys->key_count = ram_size / TARGET_PAGE_SIZE; > > > > why are you dropping machine->foo all around and switching to global ram_size? > > (I'd rather do other way around) > > Not sure what the latest best practice is. I can also simply convert to > machine->ram_size if that's the right thing to do. My understanding of it was not to use globals if possible. (I planned on removing the global ram_size and leaving only machine->ram_size, but that's a bit tricky since things tend to explode once a global is touched, so it needs some more thought/patience.)
On Fri, 27 Mar 2020 17:53:39 +0100 David Hildenbrand <david@redhat.com> wrote: > On 27.03.20 17:46, Igor Mammedov wrote: > > On Fri, 27 Mar 2020 17:05:34 +0100 > > Christian Borntraeger <borntraeger@de.ibm.com> wrote: > > > >> On 27.03.20 17:01, David Hildenbrand wrote: > >>> On 27.03.20 16:34, Christian Borntraeger wrote: > >>>> > >>>> > >>>> On 27.03.20 16:29, David Hildenbrand wrote: > >>>>> Historically, we fixed up the RAM size (rounded it down), to fit into > >>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use > >>>>> memdev for RAM"), we no longer consider the fixed-up size when > >>>>> allcoating the RAM block - which will break migration. > >>>>> > >>>>> Let's simply drop that manual fixup code and let the user supply sane > >>>>> RAM sizes. This will bail out early when trying to migrate (and make > >>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we > >>>>> should have rejected such RAM sizes right from the beginning. > >>>>> > >>>>> As we no longer fixup maxram_size as well, make other users use ram_size > >>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM, > >>>>> as that will come in handy in the future when supporting memory hotplug > >>>>> (in contrast, storage keys and storage attributes for hotplugged memory > >>>>> will have to be migrated per RAM block in the future). > >>>>> > >>>>> This fixes (or rather rejects early): > >>>>> > >>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the > >>>>> RAM block size changed. > >>>> > >>>> Not sure I like this variant. Instead of breaking migration (that was > >>>> accidentially done by Igors changes) we now reject migration from older > >>>> QEMUs to 5.0. This is not going to help those that still have such guests > >>>> running and want to migrate. > >>> > >>> As Igor mentioned on another channel, you most probably can migrate an > >>> older guest by starting it on the target with a fixed-up size. 
> >>> > >>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M" > >> > >> Yes, that should probably work. > > I'm in process of testing it. It works. > > > >>> Not sure how many such weird-size VMs we actually do have in practice. > >> > >> I am worried about some automated deployments where tooling has created > >> these sizes for dozens or hundreds of containers in VMS and so. > > IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be > surprised if it's not multiples of, say, 128MB. > > > Yep, it's possible but then that tooling/configs should be fixed to work with > > new QEMU that validates user's input. > > > > Yeah, and mention it in the cover letter, +eventually a "fixup" table > (e.g., old_size < X, has to be aligned to Y). > > One alternative is to have an early fixup hack in QEMU, that fixes up > the sizes as we did before (and eventually warns the user). Not sure if > we really want/need that. That would require at least a callback at the machine level; also, practice shows warnings are of no use. If someone insists on using a wrong size with new QEMU, one can write a wrapper script to work around the issue (if they don't wish to fix configs/tooling). I don't like keeping hacks to fix user mistakes in QEMU, and on top of that adding infrastructure for it (QEMU is already complicated enough). Since in this case it breaks migration only partially, it's sufficient to print an error message that suggests the correct size to use (as has been done for other boards; s390 is the last remaining one doing ram_size fixups). That way the user could fix the config and restart migration.
On 27.03.20 23:13, Igor Mammedov wrote: > On Fri, 27 Mar 2020 17:53:39 +0100 > David Hildenbrand <david@redhat.com> wrote: > >> On 27.03.20 17:46, Igor Mammedov wrote: >>> On Fri, 27 Mar 2020 17:05:34 +0100 >>> Christian Borntraeger <borntraeger@de.ibm.com> wrote: >>> >>>> On 27.03.20 17:01, David Hildenbrand wrote: >>>>> On 27.03.20 16:34, Christian Borntraeger wrote: >>>>>> >>>>>> >>>>>> On 27.03.20 16:29, David Hildenbrand wrote: >>>>>>> Historically, we fixed up the RAM size (rounded it down), to fit into >>>>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use >>>>>>> memdev for RAM"), we no longer consider the fixed-up size when >>>>>>> allcoating the RAM block - which will break migration. >>>>>>> >>>>>>> Let's simply drop that manual fixup code and let the user supply sane >>>>>>> RAM sizes. This will bail out early when trying to migrate (and make >>>>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we >>>>>>> should have rejected such RAM sizes right from the beginning. >>>>>>> >>>>>>> As we no longer fixup maxram_size as well, make other users use ram_size >>>>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM, >>>>>>> as that will come in handy in the future when supporting memory hotplug >>>>>>> (in contrast, storage keys and storage attributes for hotplugged memory >>>>>>> will have to be migrated per RAM block in the future). >>>>>>> >>>>>>> This fixes (or rather rejects early): >>>>>>> >>>>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the >>>>>>> RAM block size changed. >>>>>> >>>>>> Not sure I like this variant. Instead of breaking migration (that was >>>>>> accidentially done by Igors changes) we now reject migration from older >>>>>> QEMUs to 5.0. This is not going to help those that still have such guests >>>>>> running and want to migrate. 
>>>>> >>>>> As Igor mentioned on another channel, you most probably can migrate an >>>>> older guest by starting it on the target with a fixed-up size. >>>>> >>>>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M" >>>> >>>> Yes, that should probably work. >>> I'm in process of testing it. > > it works > >>> >>>>> Not sure how many such weird-size VMs we actually do have in practice. >>>> >>>> I am worried about some automated deployments where tooling has created >>>> these sizes for dozens or hundreds of containers in VMS and so. >> >> IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be >> surprised if it's not multiples of, say, 128MB. >> >>> Yep, it's possible but then that tooling/configs should be fixed to work with >>> new QEMU that validates user's input. >>> >> >> Yeah, and mention it in the cover letter, +eventually a "fixup" table >> (e.g., old_size < X, has to be aligned to Y). >> >> One alternative is to have an early fixup hack in QEMU, that fixes up >> the sizes as we did before (and eventually warns the user). Not sure if >> we really want/need that. > That would require at least a callback at machine level, > also practice shows warnings are of no use. I would strongly prefer to not break setups that used to work and do an early fixup (a machine callback). I will have a look. > > If someone insist on using wrong size with new QEMU, one can write a wrapper > script to workaround the issue (if they don't wish to fix configs/tooling). > I don't like keeping hacks to fix user mistakes in QEMU and on top of that > adding infrastructure for that (QEMU is already too much complicated). > Since in this case it break migration only partially, it's sufficient > to print error message that suggest correct size to use (like it's been done > for other boards, s390 is the last remaining that is doing ram_size fixups). > That way user could fix config and restart migration. >
On Tue, 31 Mar 2020 13:17:38 +0200 Christian Borntraeger <borntraeger@de.ibm.com> wrote: > On 27.03.20 23:13, Igor Mammedov wrote: > > On Fri, 27 Mar 2020 17:53:39 +0100 > > David Hildenbrand <david@redhat.com> wrote: > > > >> On 27.03.20 17:46, Igor Mammedov wrote: > >>> On Fri, 27 Mar 2020 17:05:34 +0100 > >>> Christian Borntraeger <borntraeger@de.ibm.com> wrote: > >>> > >>>> On 27.03.20 17:01, David Hildenbrand wrote: > >>>>> On 27.03.20 16:34, Christian Borntraeger wrote: > >>>>>> > >>>>>> > >>>>>> On 27.03.20 16:29, David Hildenbrand wrote: > >>>>>>> Historically, we fixed up the RAM size (rounded it down), to fit into > >>>>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use > >>>>>>> memdev for RAM"), we no longer consider the fixed-up size when > >>>>>>> allcoating the RAM block - which will break migration. > >>>>>>> > >>>>>>> Let's simply drop that manual fixup code and let the user supply sane > >>>>>>> RAM sizes. This will bail out early when trying to migrate (and make > >>>>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we > >>>>>>> should have rejected such RAM sizes right from the beginning. > >>>>>>> > >>>>>>> As we no longer fixup maxram_size as well, make other users use ram_size > >>>>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM, > >>>>>>> as that will come in handy in the future when supporting memory hotplug > >>>>>>> (in contrast, storage keys and storage attributes for hotplugged memory > >>>>>>> will have to be migrated per RAM block in the future). > >>>>>>> > >>>>>>> This fixes (or rather rejects early): > >>>>>>> > >>>>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the > >>>>>>> RAM block size changed. > >>>>>> > >>>>>> Not sure I like this variant. Instead of breaking migration (that was > >>>>>> accidentially done by Igors changes) we now reject migration from older > >>>>>> QEMUs to 5.0. 
This is not going to help those that still have such guests > >>>>>> running and want to migrate. > >>>>> > >>>>> As Igor mentioned on another channel, you most probably can migrate an > >>>>> older guest by starting it on the target with a fixed-up size. > >>>>> > >>>>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M" > >>>> > >>>> Yes, that should probably work. > >>> I'm in process of testing it. > > > > it works > > > >>> > >>>>> Not sure how many such weird-size VMs we actually do have in practice. > >>>> > >>>> I am worried about some automated deployments where tooling has created > >>>> these sizes for dozens or hundreds of containers in VMS and so. > >> > >> IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be > >> surprised if it's not multiples of, say, 128MB. > >> > >>> Yep, it's possible but then that tooling/configs should be fixed to work with > >>> new QEMU that validates user's input. > >>> > >> > >> Yeah, and mention it in the cover letter, +eventually a "fixup" table > >> (e.g., old_size < X, has to be aligned to Y). > >> > >> One alternative is to have an early fixup hack in QEMU, that fixes up > >> the sizes as we did before (and eventually warns the user). Not sure if > >> we really want/need that. > > That would require at least a callback at machine level, > > also practice shows warnings are of no use. > > I would strongly prefer to not break setups that used to work and do an early fixup > (a machine callback). I will have a look. If it were breaking migration stream or guest ABI, I'd agree with you but it isn't. So in this case, it's a bug that qemu wasn't checking size for alignment and making workaround to keep the bug around looks wrong to me. It is fixable on user side (one has to fix VM's config), so it should be fixed there and not in QEMU. And any automatic tooling that generates invalid size should be fixed as well. 
(I think it's not QEMU's job to mask user errors and do something the user did not ask for)

> > If someone insist on using wrong size with new QEMU, one can write a wrapper > > script to workaround the issue (if they don't wish to fix configs/tooling). > > I don't like keeping hacks to fix user mistakes in QEMU and on top of that > > adding infrastructure for that (QEMU is already too much complicated). > > Since in this case it break migration only partially, it's sufficient > > to print error message that suggest correct size to use (like it's been done > > for other boards, s390 is the last remaining that is doing ram_size fixups). > > That way user could fix config and restart migration. > > > >
On 31.03.20 17:33, Igor Mammedov wrote: > On Tue, 31 Mar 2020 13:17:38 +0200 > Christian Borntraeger <borntraeger@de.ibm.com> wrote: > >> On 27.03.20 23:13, Igor Mammedov wrote: >>> On Fri, 27 Mar 2020 17:53:39 +0100 >>> David Hildenbrand <david@redhat.com> wrote: >>> >>>> On 27.03.20 17:46, Igor Mammedov wrote: >>>>> On Fri, 27 Mar 2020 17:05:34 +0100 >>>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote: >>>>> >>>>>> On 27.03.20 17:01, David Hildenbrand wrote: >>>>>>> On 27.03.20 16:34, Christian Borntraeger wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 27.03.20 16:29, David Hildenbrand wrote: >>>>>>>>> Historically, we fixed up the RAM size (rounded it down), to fit into >>>>>>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use >>>>>>>>> memdev for RAM"), we no longer consider the fixed-up size when >>>>>>>>> allcoating the RAM block - which will break migration. >>>>>>>>> >>>>>>>>> Let's simply drop that manual fixup code and let the user supply sane >>>>>>>>> RAM sizes. This will bail out early when trying to migrate (and make >>>>>>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we >>>>>>>>> should have rejected such RAM sizes right from the beginning. >>>>>>>>> >>>>>>>>> As we no longer fixup maxram_size as well, make other users use ram_size >>>>>>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM, >>>>>>>>> as that will come in handy in the future when supporting memory hotplug >>>>>>>>> (in contrast, storage keys and storage attributes for hotplugged memory >>>>>>>>> will have to be migrated per RAM block in the future). >>>>>>>>> >>>>>>>>> This fixes (or rather rejects early): >>>>>>>>> >>>>>>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the >>>>>>>>> RAM block size changed. >>>>>>>> >>>>>>>> Not sure I like this variant. 
Instead of breaking migration (that was >>>>>>>> accidentially done by Igors changes) we now reject migration from older >>>>>>>> QEMUs to 5.0. This is not going to help those that still have such guests >>>>>>>> running and want to migrate. >>>>>>> >>>>>>> As Igor mentioned on another channel, you most probably can migrate an >>>>>>> older guest by starting it on the target with a fixed-up size. >>>>>>> >>>>>>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M" >>>>>> >>>>>> Yes, that should probably work. >>>>> I'm in process of testing it. >>> >>> it works >>> >>>>> >>>>>>> Not sure how many such weird-size VMs we actually do have in practice. >>>>>> >>>>>> I am worried about some automated deployments where tooling has created >>>>>> these sizes for dozens or hundreds of containers in VMS and so. >>>> >>>> IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be >>>> surprised if it's not multiples of, say, 128MB. >>>> >>>>> Yep, it's possible but then that tooling/configs should be fixed to work with >>>>> new QEMU that validates user's input. >>>>> >>>> >>>> Yeah, and mention it in the cover letter, +eventually a "fixup" table >>>> (e.g., old_size < X, has to be aligned to Y). >>>> >>>> One alternative is to have an early fixup hack in QEMU, that fixes up >>>> the sizes as we did before (and eventually warns the user). Not sure if >>>> we really want/need that. >>> That would require at least a callback at machine level, >>> also practice shows warnings are of no use. >> >> I would strongly prefer to not break setups that used to work and do an early fixup >> (a machine callback). I will have a look. > > If it were breaking migration stream or guest ABI, > I'd agree with you but it isn't. > > So in this case, it's a bug that qemu wasn't checking > size for alignment and making workaround to keep the bug > around looks wrong to me. 
> It is fixable on user side (one has to fix VM's config), > so it should be fixed there and not in QEMU. > And any automatic tooling that generates invalid size > should be fixed as well. > (I think it's not QEMU's job to mask users errors and > doing something that user not asked for)

Dave G mentioned that, e.g., via Cockpit it can be fairly easy to produce weird RAM sizes (via a slider). So I guess this "issue" could be more widespread than I initially thought.

I agree that it's a BUG that we didn't bail out but instead decided to fix it up.
On 31.03.20 17:33, Igor Mammedov wrote: > On Tue, 31 Mar 2020 13:17:38 +0200 > Christian Borntraeger <borntraeger@de.ibm.com> wrote: > >> On 27.03.20 23:13, Igor Mammedov wrote: >>> On Fri, 27 Mar 2020 17:53:39 +0100 >>> David Hildenbrand <david@redhat.com> wrote: >>> >>>> On 27.03.20 17:46, Igor Mammedov wrote: >>>>> On Fri, 27 Mar 2020 17:05:34 +0100 >>>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote: >>>>> >>>>>> On 27.03.20 17:01, David Hildenbrand wrote: >>>>>>> On 27.03.20 16:34, Christian Borntraeger wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 27.03.20 16:29, David Hildenbrand wrote: >>>>>>>>> Historically, we fixed up the RAM size (rounded it down), to fit into >>>>>>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use >>>>>>>>> memdev for RAM"), we no longer consider the fixed-up size when >>>>>>>>> allcoating the RAM block - which will break migration. >>>>>>>>> >>>>>>>>> Let's simply drop that manual fixup code and let the user supply sane >>>>>>>>> RAM sizes. This will bail out early when trying to migrate (and make >>>>>>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we >>>>>>>>> should have rejected such RAM sizes right from the beginning. >>>>>>>>> >>>>>>>>> As we no longer fixup maxram_size as well, make other users use ram_size >>>>>>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM, >>>>>>>>> as that will come in handy in the future when supporting memory hotplug >>>>>>>>> (in contrast, storage keys and storage attributes for hotplugged memory >>>>>>>>> will have to be migrated per RAM block in the future). >>>>>>>>> >>>>>>>>> This fixes (or rather rejects early): >>>>>>>>> >>>>>>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the >>>>>>>>> RAM block size changed. >>>>>>>> >>>>>>>> Not sure I like this variant. 
Instead of breaking migration (that was >>>>>>>> accidentially done by Igors changes) we now reject migration from older >>>>>>>> QEMUs to 5.0. This is not going to help those that still have such guests >>>>>>>> running and want to migrate. >>>>>>> >>>>>>> As Igor mentioned on another channel, you most probably can migrate an >>>>>>> older guest by starting it on the target with a fixed-up size. >>>>>>> >>>>>>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M" >>>>>> >>>>>> Yes, that should probably work. >>>>> I'm in process of testing it. >>> >>> it works >>> >>>>> >>>>>>> Not sure how many such weird-size VMs we actually do have in practice. >>>>>> >>>>>> I am worried about some automated deployments where tooling has created >>>>>> these sizes for dozens or hundreds of containers in VMS and so. >>>> >>>> IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be >>>> surprised if it's not multiples of, say, 128MB. >>>> >>>>> Yep, it's possible but then that tooling/configs should be fixed to work with >>>>> new QEMU that validates user's input. >>>>> >>>> >>>> Yeah, and mention it in the cover letter, +eventually a "fixup" table >>>> (e.g., old_size < X, has to be aligned to Y). >>>> >>>> One alternative is to have an early fixup hack in QEMU, that fixes up >>>> the sizes as we did before (and eventually warns the user). Not sure if >>>> we really want/need that. >>> That would require at least a callback at machine level, >>> also practice shows warnings are of no use. >> >> I would strongly prefer to not break setups that used to work and do an early fixup >> (a machine callback). I will have a look. > > If it were breaking migration stream or guest ABI, > I'd agree with you but it isn't. > > So in this case, it's a bug that qemu wasn't checking > size for alignment and making workaround to keep the bug > around looks wrong to me. QEMU did check for alignment and fixed it up. So it was an intentional decision (maybe a bad one). 
Your change broke working setups and I really do not like that. See my new patch for proper handling for machine types 5.0 and later, plus compat handling for older machines.
On 31.03.20 17:39, David Hildenbrand wrote: > On 31.03.20 17:33, Igor Mammedov wrote: >> On Tue, 31 Mar 2020 13:17:38 +0200 >> Christian Borntraeger <borntraeger@de.ibm.com> wrote: >> >>> On 27.03.20 23:13, Igor Mammedov wrote: >>>> On Fri, 27 Mar 2020 17:53:39 +0100 >>>> David Hildenbrand <david@redhat.com> wrote: >>>> >>>>> On 27.03.20 17:46, Igor Mammedov wrote: >>>>>> On Fri, 27 Mar 2020 17:05:34 +0100 >>>>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote: >>>>>> >>>>>>> On 27.03.20 17:01, David Hildenbrand wrote: >>>>>>>> On 27.03.20 16:34, Christian Borntraeger wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 27.03.20 16:29, David Hildenbrand wrote: >>>>>>>>>> Historically, we fixed up the RAM size (rounded it down), to fit into >>>>>>>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use >>>>>>>>>> memdev for RAM"), we no longer consider the fixed-up size when >>>>>>>>>> allcoating the RAM block - which will break migration. >>>>>>>>>> >>>>>>>>>> Let's simply drop that manual fixup code and let the user supply sane >>>>>>>>>> RAM sizes. This will bail out early when trying to migrate (and make >>>>>>>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we >>>>>>>>>> should have rejected such RAM sizes right from the beginning. >>>>>>>>>> >>>>>>>>>> As we no longer fixup maxram_size as well, make other users use ram_size >>>>>>>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM, >>>>>>>>>> as that will come in handy in the future when supporting memory hotplug >>>>>>>>>> (in contrast, storage keys and storage attributes for hotplugged memory >>>>>>>>>> will have to be migrated per RAM block in the future). >>>>>>>>>> >>>>>>>>>> This fixes (or rather rejects early): >>>>>>>>>> >>>>>>>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the >>>>>>>>>> RAM block size changed. >>>>>>>>> >>>>>>>>> Not sure I like this variant. 
Instead of breaking migration (that was >>>>>>>>> accidentially done by Igors changes) we now reject migration from older >>>>>>>>> QEMUs to 5.0. This is not going to help those that still have such guests >>>>>>>>> running and want to migrate. >>>>>>>> >>>>>>>> As Igor mentioned on another channel, you most probably can migrate an >>>>>>>> older guest by starting it on the target with a fixed-up size. >>>>>>>> >>>>>>>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M" >>>>>>> >>>>>>> Yes, that should probably work. >>>>>> I'm in process of testing it. >>>> >>>> it works >>>> >>>>>> >>>>>>>> Not sure how many such weird-size VMs we actually do have in practice. >>>>>>> >>>>>>> I am worried about some automated deployments where tooling has created >>>>>>> these sizes for dozens or hundreds of containers in VMS and so. >>>>> >>>>> IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be >>>>> surprised if it's not multiples of, say, 128MB. >>>>> >>>>>> Yep, it's possible but then that tooling/configs should be fixed to work with >>>>>> new QEMU that validates user's input. >>>>>> >>>>> >>>>> Yeah, and mention it in the cover letter, +eventually a "fixup" table >>>>> (e.g., old_size < X, has to be aligned to Y). >>>>> >>>>> One alternative is to have an early fixup hack in QEMU, that fixes up >>>>> the sizes as we did before (and eventually warns the user). Not sure if >>>>> we really want/need that. >>>> That would require at least a callback at machine level, >>>> also practice shows warnings are of no use. >>> >>> I would strongly prefer to not break setups that used to work and do an early fixup >>> (a machine callback). I will have a look. >> >> If it were breaking migration stream or guest ABI, >> I'd agree with you but it isn't. >> >> So in this case, it's a bug that qemu wasn't checking >> size for alignment and making workaround to keep the bug >> around looks wrong to me. 
>> It is fixable on user side (one has to fix VM's config), >> so it should be fixed there and not in QEMU. >> And any automatic tooling that generates invalid size >> should be fixed as well. >> (I think it's not QEMU's job to mask users errors and >> doing something that user not asked for) > > Dave G mentioned, that e.g., via Cockpit it can be fairly easy to > produce weird RAM sizes (via a slider). So I guess this "issue" could be > more widespread than I initially thought. > > I agree that it's a BUG that we didn't bail out but instead decided to > fix it up.

It was a decision back then. Likely a wrong one. Now insisting on "let's break user setups because it violates my feeling of perfection" is not the right answer, though.
diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
index 5da6e5292f..2545b1576b 100644
--- a/hw/s390x/s390-skeys.c
+++ b/hw/s390x/s390-skeys.c
@@ -11,7 +11,6 @@
 #include "qemu/osdep.h"
 #include "qemu/units.h"
-#include "hw/boards.h"
 #include "hw/s390x/storage-keys.h"
 #include "qapi/error.h"
 #include "qapi/qapi-commands-misc-target.h"
@@ -174,9 +173,8 @@ out:
 static void qemu_s390_skeys_init(Object *obj)
 {
     QEMUS390SKeysState *skeys = QEMU_S390_SKEYS(obj);
-    MachineState *machine = MACHINE(qdev_get_machine());
 
-    skeys->key_count = machine->maxram_size / TARGET_PAGE_SIZE;
+    skeys->key_count = ram_size / TARGET_PAGE_SIZE;
     skeys->keydata = g_malloc0(skeys->key_count);
 }
 
diff --git a/hw/s390x/s390-stattrib-kvm.c b/hw/s390x/s390-stattrib-kvm.c
index c7e1f35524..ae88fbc32e 100644
--- a/hw/s390x/s390-stattrib-kvm.c
+++ b/hw/s390x/s390-stattrib-kvm.c
@@ -10,7 +10,6 @@
  */
 
 #include "qemu/osdep.h"
-#include "hw/boards.h"
 #include "migration/qemu-file.h"
 #include "hw/s390x/storage-attributes.h"
 #include "qemu/error-report.h"
@@ -84,8 +83,7 @@ static int kvm_s390_stattrib_set_stattr(S390StAttribState *sa,
                                         uint8_t *values)
 {
     KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa);
-    MachineState *machine = MACHINE(qdev_get_machine());
-    unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE;
+    unsigned long max = ram_size / TARGET_PAGE_SIZE;
 
     if (start_gfn + count > max) {
         error_report("Out of memory bounds when setting storage attributes");
@@ -103,8 +101,7 @@ static int kvm_s390_stattrib_set_stattr(S390StAttribState *sa,
 static void kvm_s390_stattrib_synchronize(S390StAttribState *sa)
 {
     KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa);
-    MachineState *machine = MACHINE(qdev_get_machine());
-    unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE;
+    unsigned long max = ram_size / TARGET_PAGE_SIZE;
     /* We do not need to reach the maximum buffer size allowed */
     unsigned long cx, len = KVM_S390_SKEYS_MAX / 2;
     int r;
diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index af0bfbc2ec..6af471fb3f 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -326,8 +326,7 @@ out:
 
 static void sclp_memory_init(SCLPDevice *sclp)
 {
-    MachineState *machine = MACHINE(qdev_get_machine());
-    ram_addr_t initial_mem = machine->ram_size;
+    uint64_t initial_mem = ram_size;
     int increment_size = 20;
 
     /* The storage increment size is a multiple of 1M and is a power of 2.
@@ -339,15 +338,17 @@ static void sclp_memory_init(SCLPDevice *sclp)
     }
     sclp->increment_size = increment_size;
 
-    /* The core memory area needs to be aligned with the increment size.
-     * In effect, this can cause the user-specified memory size to be rounded
-     * down to align with the nearest increment boundary. */
+    /*
+     * The core memory area needs to be aligned to the increment size. In
+     * case it's not aligned, bail out.
+     */
     initial_mem = initial_mem >> increment_size << increment_size;
-
-    machine->ram_size = initial_mem;
-    machine->maxram_size = initial_mem;
-    /* let's propagate the changed ram size into the global variable. */
-    ram_size = initial_mem;
+    if (initial_mem != ram_size) {
+        error_report("RAM size not aligned to storage increments."
+                     " Possible aligned RAM size: %" PRIu64 " MB",
+                     initial_mem / MiB);
+        exit(1);
+    }
 }
 
 static void sclp_init(Object *obj)
Historically, we fixed up the RAM size (rounded it down) to fit into storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM"), we no longer consider the fixed-up size when allocating the RAM block - which will break migration.

Let's simply drop that manual fixup code and let the user supply sane RAM sizes. This will bail out early when trying to migrate (and make an existing guest with, e.g., 12345 MB non-migratable), but maybe we should have rejected such RAM sizes right from the beginning.

As we no longer fix up maxram_size as well, make other users use ram_size instead. Keep using maxram_size when setting the maximum ram size in KVM, as that will come in handy in the future when supporting memory hotplug (in contrast, storage keys and storage attributes for hotplugged memory will have to be migrated per RAM block in the future).

This fixes (or rather rejects early):

1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the RAM block size changed.

2. Migrating upstream QEMU to upstream QEMU (e.g., with "-m 1235M"), as we receive storage attributes for memory we don't expect (as we fixed up ram_size and maxram_size).

Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM")
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/s390x/s390-skeys.c        |  4 +---
 hw/s390x/s390-stattrib-kvm.c |  7 ++-----
 hw/s390x/sclp.c              | 21 +++++++++++----------
 3 files changed, 14 insertions(+), 18 deletions(-)