
[v1] s390x: Reject unaligned RAM sizes

Message ID 20200327152930.66636-1-david@redhat.com
State New
Series [v1] s390x: Reject unaligned RAM sizes

Commit Message

David Hildenbrand March 27, 2020, 3:29 p.m. UTC
Historically, we fixed up the RAM size (rounded it down), to fit into
storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
memdev for RAM"), we no longer consider the fixed-up size when
allocating the RAM block - which will break migration.

Let's simply drop that manual fixup code and let the user supply sane
RAM sizes. This will bail out early when trying to migrate (and make
an existing guest with e.g., 12345 MB non-migratable), but maybe we
should have rejected such RAM sizes right from the beginning.

As we no longer fixup maxram_size as well, make other users use ram_size
instead. Keep using maxram_size when setting the maximum ram size in KVM,
as that will come in handy in the future when supporting memory hotplug
(in contrast, storage keys and storage attributes for hotplugged memory
 will have to be migrated per RAM block in the future).

This fixes (or rather rejects early):

1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
   RAM block size changed.

2. Migrating upstream QEMU to upstream QEMU (e.g., with "-m 1235M"), as
   we receive storage attributes for memory we don't expect (as we fixed up
   ram_size and maxram_size).

Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM")
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/s390x/s390-skeys.c        |  4 +---
 hw/s390x/s390-stattrib-kvm.c |  7 ++-----
 hw/s390x/sclp.c              | 21 +++++++++++----------
 3 files changed, 14 insertions(+), 18 deletions(-)
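
For context, a minimal standalone sketch of the rounding behaviour this patch
removes, modeled on sclp_memory_init() below (MAX_STORAGE_INCREMENTS = 1020 is
an assumption taken from include/hw/s390x/sclp.h):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define MiB (1024ULL * 1024)
#define MAX_STORAGE_INCREMENTS 1020 /* assumed from include/hw/s390x/sclp.h */

/* Compute the fixed-up (rounded-down) RAM size the old code produced. */
static uint64_t fixed_up_ram_size(uint64_t ram_size)
{
    int increment_size = 20; /* 1 MiB, doubled until the increment count fits */

    /* The storage increment size is a multiple of 1M and a power of 2. */
    while ((ram_size >> increment_size) > MAX_STORAGE_INCREMENTS) {
        increment_size++;
    }
    /* Round down to the increment boundary - the old silent fixup. */
    return ram_size >> increment_size << increment_size;
}

int main(void)
{
    /* "-m 1235M": 1235 increments of 1 MiB exceed 1020, so the increment
     * size grows to 2 MiB and the size is silently rounded down. */
    printf("%" PRIu64 " MiB\n", fixed_up_ram_size(1235 * MiB) / MiB); /* 1234 */
    return 0;
}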

Comments

Christian Borntraeger March 27, 2020, 3:34 p.m. UTC | #1
On 27.03.20 16:29, David Hildenbrand wrote:
> Historically, we fixed up the RAM size (rounded it down), to fit into
> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
> memdev for RAM"), we no longer consider the fixed-up size when
> allocating the RAM block - which will break migration.
> 
> Let's simply drop that manual fixup code and let the user supply sane
> RAM sizes. This will bail out early when trying to migrate (and make
> an existing guest with e.g., 12345 MB non-migratable), but maybe we
> should have rejected such RAM sizes right from the beginning.
> 
> As we no longer fixup maxram_size as well, make other users use ram_size
> instead. Keep using maxram_size when setting the maximum ram size in KVM,
> as that will come in handy in the future when supporting memory hotplug
> (in contrast, storage keys and storage attributes for hotplugged memory
>  will have to be migrated per RAM block in the future).
> 
> This fixes (or rather rejects early):
> 
> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
>    RAM block size changed.

Not sure I like this variant. Instead of breaking migration (that was 
accidentally done by Igor's changes) we now reject migration from older
QEMUs to 5.0. This is not going to help those that still have such guests
running and want to migrate. 

> 
> 2. Migrating upstream QEMU to upstream QEMU (e.g., with "-m 1235M"), as
>    we receive storage attributes for memory we don't expect (as we fixed up
>    ram_size and maxram_size).
> 
> Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM")
> Reported-by: Lukáš Doktor <ldoktor@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
David Hildenbrand March 27, 2020, 4:01 p.m. UTC | #2
On 27.03.20 16:34, Christian Borntraeger wrote:
> 
> 
> On 27.03.20 16:29, David Hildenbrand wrote:
>> Historically, we fixed up the RAM size (rounded it down), to fit into
>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
>> memdev for RAM"), we no longer consider the fixed-up size when
>> allocating the RAM block - which will break migration.
>>
>> Let's simply drop that manual fixup code and let the user supply sane
>> RAM sizes. This will bail out early when trying to migrate (and make
>> an existing guest with e.g., 12345 MB non-migratable), but maybe we
>> should have rejected such RAM sizes right from the beginning.
>>
>> As we no longer fixup maxram_size as well, make other users use ram_size
>> instead. Keep using maxram_size when setting the maximum ram size in KVM,
>> as that will come in handy in the future when supporting memory hotplug
>> (in contrast, storage keys and storage attributes for hotplugged memory
>>  will have to be migrated per RAM block in the future).
>>
>> This fixes (or rather rejects early):
>>
>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
>>    RAM block size changed.
> 
> Not sure I like this variant. Instead of breaking migration (that was 
> accidentally done by Igor's changes) we now reject migration from older
> QEMUs to 5.0. This is not going to help those that still have such guests
> running and want to migrate. 

As Igor mentioned on another channel, you most probably can migrate an
older guest by starting it on the target with a fixed-up size.

E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M"

Not sure how many such weird-size VMs we actually do have in practice.
Christian Borntraeger March 27, 2020, 4:05 p.m. UTC | #3
On 27.03.20 17:01, David Hildenbrand wrote:
> On 27.03.20 16:34, Christian Borntraeger wrote:
>>
>>
>> On 27.03.20 16:29, David Hildenbrand wrote:
>>> Historically, we fixed up the RAM size (rounded it down), to fit into
>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
>>> memdev for RAM"), we no longer consider the fixed-up size when
>>> allocating the RAM block - which will break migration.
>>>
>>> Let's simply drop that manual fixup code and let the user supply sane
>>> RAM sizes. This will bail out early when trying to migrate (and make
>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we
>>> should have rejected such RAM sizes right from the beginning.
>>>
>>> As we no longer fixup maxram_size as well, make other users use ram_size
>>> instead. Keep using maxram_size when setting the maximum ram size in KVM,
>>> as that will come in handy in the future when supporting memory hotplug
>>> (in contrast, storage keys and storage attributes for hotplugged memory
>>>  will have to be migrated per RAM block in the future).
>>>
>>> This fixes (or rather rejects early):
>>>
>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
>>>    RAM block size changed.
>>
>> Not sure I like this variant. Instead of breaking migration (that was 
>> accidentally done by Igor's changes) we now reject migration from older
>> QEMUs to 5.0. This is not going to help those that still have such guests
>> running and want to migrate. 
> 
> As Igor mentioned on another channel, you most probably can migrate an
> older guest by starting it on the target with a fixed-up size.
> 
> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M"

Yes, that should probably work.
> 
> Not sure how many such weird-size VMs we actually do have in practice.

I am worried about some automated deployments where tooling has created
these sizes for dozens or hundreds of containers in VMs and so on.
Igor Mammedov March 27, 2020, 4:46 p.m. UTC | #4
On Fri, 27 Mar 2020 17:05:34 +0100
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 27.03.20 17:01, David Hildenbrand wrote:
> > On 27.03.20 16:34, Christian Borntraeger wrote:  
> >>
> >>
> >> On 27.03.20 16:29, David Hildenbrand wrote:  
> >>> Historically, we fixed up the RAM size (rounded it down), to fit into
> >>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
> >>> memdev for RAM"), we no longer consider the fixed-up size when
> >>> allocating the RAM block - which will break migration.
> >>>
> >>> Let's simply drop that manual fixup code and let the user supply sane
> >>> RAM sizes. This will bail out early when trying to migrate (and make
> >>> an existing guest with e.g., 12345 MB non-migratable), but maybe we
> >>> should have rejected such RAM sizes right from the beginning.
> >>>
> >>> As we no longer fixup maxram_size as well, make other users use ram_size
> >>> instead. Keep using maxram_size when setting the maximum ram size in KVM,
> >>> as that will come in handy in the future when supporting memory hotplug
> >>> (in contrast, storage keys and storage attributes for hotplugged memory
> >>>  will have to be migrated per RAM block in the future).
> >>>
> >>> This fixes (or rather rejects early):
> >>>
> >>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
> >>>    RAM block size changed.  
> >>
> >> Not sure I like this variant. Instead of breaking migration (that was 
> >> accidentally done by Igor's changes) we now reject migration from older
> >> QEMUs to 5.0. This is not going to help those that still have such guests
> >> running and want to migrate.   
> > 
> > As Igor mentioned on another channel, you most probably can migrate an
> > older guest by starting it on the target with a fixed-up size.
> > 
> > E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M"  
> 
> Yes, that should probably work.
I'm in process of testing it.

> > Not sure how many such weird-size VMs we actually do have in practice.  
> 
> I am worried about some automated deployments where tooling has created
> these sizes for dozens or hundreds of containers in VMs and so on.
Yep, it's possible but then that tooling/configs should be fixed to work with
new QEMU that validates user's input.
Igor Mammedov March 27, 2020, 4:48 p.m. UTC | #5
On Fri, 27 Mar 2020 16:29:30 +0100
David Hildenbrand <david@redhat.com> wrote:

> Historically, we fixed up the RAM size (rounded it down), to fit into
> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
> memdev for RAM"), we no longer consider the fixed-up size when
> allocating the RAM block - which will break migration.
> 
> Let's simply drop that manual fixup code and let the user supply sane
> RAM sizes. This will bail out early when trying to migrate (and make
> an existing guest with e.g., 12345 MB non-migratable), but maybe we
> should have rejected such RAM sizes right from the beginning.
> 
> As we no longer fixup maxram_size as well, make other users use ram_size
> instead. Keep using maxram_size when setting the maximum ram size in KVM,
> as that will come in handy in the future when supporting memory hotplug
> (in contrast, storage keys and storage attributes for hotplugged memory
>  will have to be migrated per RAM block in the future).
> 
> This fixes (or rather rejects early):
> 
> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
>    RAM block size changed.
> 
> 2. Migrating upstream QEMU to upstream QEMU (e.g., with "-m 1235M"), as
>    we receive storage attributes for memory we don't expect (as we fixed up
>    ram_size and maxram_size).
> 
> Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM")
> Reported-by: Lukáš Doktor <ldoktor@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  hw/s390x/s390-skeys.c        |  4 +---
>  hw/s390x/s390-stattrib-kvm.c |  7 ++-----
>  hw/s390x/sclp.c              | 21 +++++++++++----------
>  3 files changed, 14 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
> index 5da6e5292f..2545b1576b 100644
> --- a/hw/s390x/s390-skeys.c
> +++ b/hw/s390x/s390-skeys.c
> @@ -11,7 +11,6 @@
>  
>  #include "qemu/osdep.h"
>  #include "qemu/units.h"
> -#include "hw/boards.h"
>  #include "hw/s390x/storage-keys.h"
>  #include "qapi/error.h"
>  #include "qapi/qapi-commands-misc-target.h"
> @@ -174,9 +173,8 @@ out:
>  static void qemu_s390_skeys_init(Object *obj)
>  {
>      QEMUS390SKeysState *skeys = QEMU_S390_SKEYS(obj);
> -    MachineState *machine = MACHINE(qdev_get_machine());
>  
> -    skeys->key_count = machine->maxram_size / TARGET_PAGE_SIZE;
> +    skeys->key_count = ram_size / TARGET_PAGE_SIZE;

why are you dropping machine->foo all around and switching to global ram_size?
(I'd rather do it the other way around)

>      skeys->keydata = g_malloc0(skeys->key_count);
>  }
>  
> diff --git a/hw/s390x/s390-stattrib-kvm.c b/hw/s390x/s390-stattrib-kvm.c
> index c7e1f35524..ae88fbc32e 100644
> --- a/hw/s390x/s390-stattrib-kvm.c
> +++ b/hw/s390x/s390-stattrib-kvm.c
> @@ -10,7 +10,6 @@
>   */
>  
>  #include "qemu/osdep.h"
> -#include "hw/boards.h"
>  #include "migration/qemu-file.h"
>  #include "hw/s390x/storage-attributes.h"
>  #include "qemu/error-report.h"
> @@ -84,8 +83,7 @@ static int kvm_s390_stattrib_set_stattr(S390StAttribState *sa,
>                                          uint8_t *values)
>  {
>      KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa);
> -    MachineState *machine = MACHINE(qdev_get_machine());
> -    unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE;
> +    unsigned long max = ram_size / TARGET_PAGE_SIZE;
>  
>      if (start_gfn + count > max) {
>          error_report("Out of memory bounds when setting storage attributes");
> @@ -103,8 +101,7 @@ static int kvm_s390_stattrib_set_stattr(S390StAttribState *sa,
>  static void kvm_s390_stattrib_synchronize(S390StAttribState *sa)
>  {
>      KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa);
> -    MachineState *machine = MACHINE(qdev_get_machine());
> -    unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE;
> +    unsigned long max = ram_size / TARGET_PAGE_SIZE;
>      /* We do not need to reach the maximum buffer size allowed */
>      unsigned long cx, len = KVM_S390_SKEYS_MAX / 2;
>      int r;
> diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
> index af0bfbc2ec..6af471fb3f 100644
> --- a/hw/s390x/sclp.c
> +++ b/hw/s390x/sclp.c
> @@ -326,8 +326,7 @@ out:
>  
>  static void sclp_memory_init(SCLPDevice *sclp)
>  {
> -    MachineState *machine = MACHINE(qdev_get_machine());
> -    ram_addr_t initial_mem = machine->ram_size;
> +    uint64_t initial_mem = ram_size;
>      int increment_size = 20;
>  
>      /* The storage increment size is a multiple of 1M and is a power of 2.
> @@ -339,15 +338,17 @@ static void sclp_memory_init(SCLPDevice *sclp)
>      }
>      sclp->increment_size = increment_size;
>  
> -    /* The core memory area needs to be aligned with the increment size.
> -     * In effect, this can cause the user-specified memory size to be rounded
> -     * down to align with the nearest increment boundary. */
> +    /*
> +     * The core memory area needs to be aligned to the increment size. In
> +     * case it's not aligned, bail out.
> +     */
>      initial_mem = initial_mem >> increment_size << increment_size;
> -
> -    machine->ram_size = initial_mem;
> -    machine->maxram_size = initial_mem;
> -    /* let's propagate the changed ram size into the global variable. */
> -    ram_size = initial_mem;
> +    if (initial_mem != ram_size) {
> +        error_report("RAM size not aligned to storage increments."
> +                     " Possible aligned RAM size: %" PRIu64 " MB",
> +                     initial_mem / MiB);
> +        exit(1);
> +    }
>  }
>  
>  static void sclp_init(Object *obj)
David Hildenbrand March 27, 2020, 4:51 p.m. UTC | #6
On 27.03.20 17:48, Igor Mammedov wrote:
> On Fri, 27 Mar 2020 16:29:30 +0100
> David Hildenbrand <david@redhat.com> wrote:
> 
>> Historically, we fixed up the RAM size (rounded it down), to fit into
>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
>> memdev for RAM"), we no longer consider the fixed-up size when
>> allocating the RAM block - which will break migration.
>>
>> Let's simply drop that manual fixup code and let the user supply sane
>> RAM sizes. This will bail out early when trying to migrate (and make
>> an existing guest with e.g., 12345 MB non-migratable), but maybe we
>> should have rejected such RAM sizes right from the beginning.
>>
>> As we no longer fixup maxram_size as well, make other users use ram_size
>> instead. Keep using maxram_size when setting the maximum ram size in KVM,
>> as that will come in handy in the future when supporting memory hotplug
>> (in contrast, storage keys and storage attributes for hotplugged memory
>>  will have to be migrated per RAM block in the future).
>>
>> This fixes (or rather rejects early):
>>
>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
>>    RAM block size changed.
>>
>> 2. Migrating upstream QEMU to upstream QEMU (e.g., with "-m 1235M"), as
>>    we receive storage attributes for memory we don't expect (as we fixed up
>>    ram_size and maxram_size).
>>
>> Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM")
>> Reported-by: Lukáš Doktor <ldoktor@redhat.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  hw/s390x/s390-skeys.c        |  4 +---
>>  hw/s390x/s390-stattrib-kvm.c |  7 ++-----
>>  hw/s390x/sclp.c              | 21 +++++++++++----------
>>  3 files changed, 14 insertions(+), 18 deletions(-)
>>
>> diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
>> index 5da6e5292f..2545b1576b 100644
>> --- a/hw/s390x/s390-skeys.c
>> +++ b/hw/s390x/s390-skeys.c
>> @@ -11,7 +11,6 @@
>>  
>>  #include "qemu/osdep.h"
>>  #include "qemu/units.h"
>> -#include "hw/boards.h"
>>  #include "hw/s390x/storage-keys.h"
>>  #include "qapi/error.h"
>>  #include "qapi/qapi-commands-misc-target.h"
>> @@ -174,9 +173,8 @@ out:
>>  static void qemu_s390_skeys_init(Object *obj)
>>  {
>>      QEMUS390SKeysState *skeys = QEMU_S390_SKEYS(obj);
>> -    MachineState *machine = MACHINE(qdev_get_machine());
>>  
>> -    skeys->key_count = machine->maxram_size / TARGET_PAGE_SIZE;
>> +    skeys->key_count = ram_size / TARGET_PAGE_SIZE;
> 
> why are you dropping machine->foo all around and switching to global ram_size?
> (I'd rather do it the other way around)

Not sure what the latest best practice is. I can also simply convert to
machine->ram_size if that's the right thing to do.
David Hildenbrand March 27, 2020, 4:53 p.m. UTC | #7
On 27.03.20 17:46, Igor Mammedov wrote:
> On Fri, 27 Mar 2020 17:05:34 +0100
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> 
>> On 27.03.20 17:01, David Hildenbrand wrote:
>>> On 27.03.20 16:34, Christian Borntraeger wrote:  
>>>>
>>>>
>>>> On 27.03.20 16:29, David Hildenbrand wrote:  
>>>>> Historically, we fixed up the RAM size (rounded it down), to fit into
>>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
>>>>> memdev for RAM"), we no longer consider the fixed-up size when
>>>>> allocating the RAM block - which will break migration.
>>>>>
>>>>> Let's simply drop that manual fixup code and let the user supply sane
>>>>> RAM sizes. This will bail out early when trying to migrate (and make
>>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we
>>>>> should have rejected such RAM sizes right from the beginning.
>>>>>
>>>>> As we no longer fixup maxram_size as well, make other users use ram_size
>>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM,
>>>>> as that will come in handy in the future when supporting memory hotplug
>>>>> (in contrast, storage keys and storage attributes for hotplugged memory
>>>>>  will have to be migrated per RAM block in the future).
>>>>>
>>>>> This fixes (or rather rejects early):
>>>>>
>>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
>>>>>    RAM block size changed.  
>>>>
>>>> Not sure I like this variant. Instead of breaking migration (that was 
>>>> accidentally done by Igor's changes) we now reject migration from older
>>>> QEMUs to 5.0. This is not going to help those that still have such guests
>>>> running and want to migrate.   
>>>
>>> As Igor mentioned on another channel, you most probably can migrate an
>>> older guest by starting it on the target with a fixed-up size.
>>>
>>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M"  
>>
>> Yes, that should probably work.
> I'm in process of testing it.
> 
>>> Not sure how many such weird-size VMs we actually do have in practice.  
>>
>> I am worried about some automated deployments where tooling has created
>> these sizes for dozens or hundreds of containers in VMs and so on.

IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be
surprised if it's not multiples of, say, 128MB.

> Yep, it's possible but then that tooling/configs should be fixed to work with
> new QEMU that validates user's input.
> 

Yeah, and mention it in the cover letter, +eventually a "fixup" table
(e.g., old_size < X, has to be aligned to Y).

One alternative is to have an early fixup hack in QEMU, that fixes up
the sizes as we did before (and eventually warns the user). Not sure if
we really want/need that.
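
A rough sketch of what such an early fixup hook could look like (hypothetical:
a fixup hook at the machine level is assumed here, not an existing API;
warn_report() and MiB are from qemu/error-report.h and qemu/units.h):

static ram_addr_t s390_fixup_ram_size(ram_addr_t sz)
{
    int inc = 20; /* 1 MiB, doubled until the increment count fits */
    ram_addr_t newsz;

    while ((sz >> inc) > MAX_STORAGE_INCREMENTS) {
        inc++;
    }
    newsz = sz >> inc << inc;
    if (newsz != sz) {
        /* Keep the old rounding, but tell the user about it this time. */
        warn_report("RAM size %" PRIu64 " MB rounded down to %" PRIu64 " MB",
                    (uint64_t)(sz / MiB), (uint64_t)(newsz / MiB));
    }
    return newsz;
}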
Halil Pasic March 27, 2020, 6:16 p.m. UTC | #8
On Fri, 27 Mar 2020 17:46:20 +0100
Igor Mammedov <imammedo@redhat.com> wrote:

> On Fri, 27 Mar 2020 17:05:34 +0100
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> 
> > On 27.03.20 17:01, David Hildenbrand wrote:
> > > On 27.03.20 16:34, Christian Borntraeger wrote:  
> > >>
> > >>
> > >> On 27.03.20 16:29, David Hildenbrand wrote:  
> > >>> Historically, we fixed up the RAM size (rounded it down), to fit into
> > >>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
> > >>> memdev for RAM"), we no longer consider the fixed-up size when
> > >>> allocating the RAM block - which will break migration.
> > >>>
> > >>> Let's simply drop that manual fixup code and let the user supply sane
> > >>> RAM sizes. This will bail out early when trying to migrate (and make
> > >>> an existing guest with e.g., 12345 MB non-migratable), but maybe we
> > >>> should have rejected such RAM sizes right from the beginning.
> > >>>
> > >>> As we no longer fixup maxram_size as well, make other users use ram_size
> > >>> instead. Keep using maxram_size when setting the maximum ram size in KVM,
> > >>> as that will come in handy in the future when supporting memory hotplug
> > >>> (in contrast, storage keys and storage attributes for hotplugged memory
> > >>>  will have to be migrated per RAM block in the future).
> > >>>
> > >>> This fixes (or rather rejects early):
> > >>>
> > >>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
> > >>>    RAM block size changed.  
> > >>
> > >> Not sure I like this variant. Instead of breaking migration (that was 
> > >> accidentally done by Igor's changes) we now reject migration from older
> > >> QEMUs to 5.0. This is not going to help those that still have such guests
> > >> running and want to migrate.   
> > > 
> > > As Igor mentioned on another channel, you most probably can migrate an
> > > older guest by starting it on the target with a fixed-up size.
> > > 
> > > E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M"  
> > 
> > Yes, that should probably work.
> I'm in process of testing it.
> 
> > > Not sure how many such weird-size VMs we actually do have in practice.  
> > 
> > I am worried about some automated deployments where tooling has created
> > these sizes for dozens or hundreds of containers in VMs and so on.
> Yep, it's possible but then that tooling/configs should be fixed to work with
> new QEMU that validates user's input.
> 

@David: I'm a little confused. Is this fix about adding user input
validation, or is it about changing what valid inputs are?

I don't see this alignment requirement documented, so my guess is the
latter. And then, I'm not sure I'm sold on this.

Regards,
Halil
David Hildenbrand March 27, 2020, 6:25 p.m. UTC | #9
On 27.03.20 19:16, Halil Pasic wrote:
> On Fri, 27 Mar 2020 17:46:20 +0100
> Igor Mammedov <imammedo@redhat.com> wrote:
> 
>> On Fri, 27 Mar 2020 17:05:34 +0100
>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>>
>>> On 27.03.20 17:01, David Hildenbrand wrote:
>>>> On 27.03.20 16:34, Christian Borntraeger wrote:  
>>>>>
>>>>>
>>>>> On 27.03.20 16:29, David Hildenbrand wrote:  
>>>>>> Historically, we fixed up the RAM size (rounded it down), to fit into
>>>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
>>>>>> memdev for RAM"), we no longer consider the fixed-up size when
>>>>>> allocating the RAM block - which will break migration.
>>>>>>
>>>>>> Let's simply drop that manual fixup code and let the user supply sane
>>>>>> RAM sizes. This will bail out early when trying to migrate (and make
>>>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we
>>>>>> should have rejected such RAM sizes right from the beginning.
>>>>>>
>>>>>> As we no longer fixup maxram_size as well, make other users use ram_size
>>>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM,
>>>>>> as that will come in handy in the future when supporting memory hotplug
>>>>>> (in contrast, storage keys and storage attributes for hotplugged memory
>>>>>>  will have to be migrated per RAM block in the future).
>>>>>>
>>>>>> This fixes (or rather rejects early):
>>>>>>
>>>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
>>>>>>    RAM block size changed.  
>>>>>
>>>>> Not sure I like this variant. Instead of breaking migration (that was 
>>>>> accidentally done by Igor's changes) we now reject migration from older
>>>>> QEMUs to 5.0. This is not going to help those that still have such guests
>>>>> running and want to migrate.   
>>>>
>>>> As Igor mentioned on another channel, you most probably can migrate an
>>>> older guest by starting it on the target with a fixed-up size.
>>>>
>>>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M"  
>>>
>>> Yes, that should probably work.
>> I'm in process of testing it.
>>
>>>> Not sure how many such weird-size VMs we actually do have in practice.  
>>>
>>> I am worried about some automated deployments where tooling has created
>>> these sizes for dozens or hundreds of containers in VMs and so on.
>> Yep, it's possible but then that tooling/configs should be fixed to work with
>> new QEMU that validates user's input.
>>
> 
> @David: I'm a little confused. Is this fix about adding user input
> validation, or is it about changing what valid inputs are?

We used to silently fix up the user input instead of bailing out. So it's
rather the latter. It was never the right choice to silently fix up
instead of just bailing out. We never should have done that.

> 
> I don't see this alignment requirement documented, so my guess is the
> latter. And then, I'm not sure I'm sold on this.

Well, we can add documentation with the next release. It's something to
end up in the release notes.

If somebody specified "-m 1235", he always received "1234 MB" and never
"1235 MB". You could also call that a BUG, which should have bailed out
right from the start (there wasn't even a warning).
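
(Worked example, assuming MAX_STORAGE_INCREMENTS is 1020: "-m 1235M" asks for
1235 increments of 1 MiB, which exceeds that limit, so the increment size
doubles to 2 MiB and the size was silently rounded down to 617 * 2 MiB =
1234 MB.)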
Igor Mammedov March 27, 2020, 9:38 p.m. UTC | #10
On Fri, 27 Mar 2020 17:51:23 +0100
David Hildenbrand <david@redhat.com> wrote:

> On 27.03.20 17:48, Igor Mammedov wrote:
> > On Fri, 27 Mar 2020 16:29:30 +0100
> > David Hildenbrand <david@redhat.com> wrote:
> >   
> >> Historically, we fixed up the RAM size (rounded it down), to fit into
> >> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
> >> memdev for RAM"), we no longer consider the fixed-up size when
> >> allocating the RAM block - which will break migration.
> >>
> >> Let's simply drop that manual fixup code and let the user supply sane
> >> RAM sizes. This will bail out early when trying to migrate (and make
> >> an existing guest with e.g., 12345 MB non-migratable), but maybe we
> >> should have rejected such RAM sizes right from the beginning.
> >>
> >> As we no longer fixup maxram_size as well, make other users use ram_size
> >> instead. Keep using maxram_size when setting the maximum ram size in KVM,
> >> as that will come in handy in the future when supporting memory hotplug
> >> (in contrast, storage keys and storage attributes for hotplugged memory
> >>  will have to be migrated per RAM block in the future).
> >>
> >> This fixes (or rather rejects early):
> >>
> >> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
> >>    RAM block size changed.
> >>
> >> 2. Migrating upstream QEMU to upstream QEMU (e.g., with "-m 1235M"), as
> >>    we receive storage attributes for memory we don't expect (as we fixed up
> >>    ram_size and maxram_size).
> >>
> >> Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM")
> >> Reported-by: Lukáš Doktor <ldoktor@redhat.com>
> >> Cc: Igor Mammedov <imammedo@redhat.com>
> >> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >> Signed-off-by: David Hildenbrand <david@redhat.com>
> >> ---
> >>  hw/s390x/s390-skeys.c        |  4 +---
> >>  hw/s390x/s390-stattrib-kvm.c |  7 ++-----
> >>  hw/s390x/sclp.c              | 21 +++++++++++----------
> >>  3 files changed, 14 insertions(+), 18 deletions(-)
> >>
> >> diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
> >> index 5da6e5292f..2545b1576b 100644
> >> --- a/hw/s390x/s390-skeys.c
> >> +++ b/hw/s390x/s390-skeys.c
> >> @@ -11,7 +11,6 @@
> >>  
> >>  #include "qemu/osdep.h"
> >>  #include "qemu/units.h"
> >> -#include "hw/boards.h"
> >>  #include "hw/s390x/storage-keys.h"
> >>  #include "qapi/error.h"
> >>  #include "qapi/qapi-commands-misc-target.h"
> >> @@ -174,9 +173,8 @@ out:
> >>  static void qemu_s390_skeys_init(Object *obj)
> >>  {
> >>      QEMUS390SKeysState *skeys = QEMU_S390_SKEYS(obj);
> >> -    MachineState *machine = MACHINE(qdev_get_machine());
> >>  
> >> -    skeys->key_count = machine->maxram_size / TARGET_PAGE_SIZE;
> >> +    skeys->key_count = ram_size / TARGET_PAGE_SIZE;  
> > 
> > why are you dropping machine->foo all around and switching to global ram_size?
> (I'd rather do it the other way around)
> 
> Not sure what the latest best practice is. I can also simply convert to
> machine->ram_size if that's the right thing to do.
My understanding of it was not to use globals if possible.
(I planned on removing the global ram_size and leaving only machine->ram_size,
but that's a bit tricky since things tend to explode once a global is touched,
so it needs some more thought/patience)
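
For reference, a sketch of the machine-property style being suggested, using
only identifiers that already appear in the patch:

/* Sketch: query the machine state instead of the ram_size global. */
MachineState *machine = MACHINE(qdev_get_machine());

skeys->key_count = machine->ram_size / TARGET_PAGE_SIZE;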
Igor Mammedov March 27, 2020, 10:13 p.m. UTC | #11
On Fri, 27 Mar 2020 17:53:39 +0100
David Hildenbrand <david@redhat.com> wrote:

> On 27.03.20 17:46, Igor Mammedov wrote:
> > On Fri, 27 Mar 2020 17:05:34 +0100
> > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> >   
> >> On 27.03.20 17:01, David Hildenbrand wrote:  
> >>> On 27.03.20 16:34, Christian Borntraeger wrote:    
> >>>>
> >>>>
> >>>> On 27.03.20 16:29, David Hildenbrand wrote:    
> >>>>> Historically, we fixed up the RAM size (rounded it down), to fit into
> >>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
> >>>>> memdev for RAM"), we no longer consider the fixed-up size when
> >>>>> allocating the RAM block - which will break migration.
> >>>>>
> >>>>> Let's simply drop that manual fixup code and let the user supply sane
> >>>>> RAM sizes. This will bail out early when trying to migrate (and make
> >>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we
> >>>>> should have rejected such RAM sizes right from the beginning.
> >>>>>
> >>>>> As we no longer fixup maxram_size as well, make other users use ram_size
> >>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM,
> >>>>> as that will come in handy in the future when supporting memory hotplug
> >>>>> (in contrast, storage keys and storage attributes for hotplugged memory
> >>>>>  will have to be migrated per RAM block in the future).
> >>>>>
> >>>>> This fixes (or rather rejects early):
> >>>>>
> >>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
> >>>>>    RAM block size changed.    
> >>>>
> >>>> Not sure I like this variant. Instead of breaking migration (that was 
> >>>> accidentally done by Igor's changes) we now reject migration from older
> >>>> QEMUs to 5.0. This is not going to help those that still have such guests
> >>>> running and want to migrate.     
> >>>
> >>> As Igor mentioned on another channel, you most probably can migrate an
> >>> older guest by starting it on the target with a fixed-up size.
> >>>
> >>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M"    
> >>
> >> Yes, that should probably work.  
> > I'm in process of testing it.

it works

> >   
> >>> Not sure how many such weird-size VMs we actually do have in practice.    
> >>
> >> I am worried about some automated deployments where tooling has created
> >> these sizes for dozens or hundreds of containers in VMs and so on.
> 
> IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be
> surprised if it's not multiples of, say, 128MB.
> 
> > Yep, it's possible but then that tooling/configs should be fixed to work with
> > new QEMU that validates user's input.
> >   
> 
> Yeah, and mention it in the cover letter, +eventually a "fixup" table
> (e.g., old_size < X, has to be aligned to Y).
> 
> One alternative is to have an early fixup hack in QEMU, that fixes up
> the sizes as we did before (and eventually warns the user). Not sure if
> we really want/need that.
That would require at least a callback at machine level, 
also practice shows warnings are of no use.

If someone insists on using a wrong size with new QEMU, one can write a wrapper
script to work around the issue (if they don't wish to fix configs/tooling).
I don't like keeping hacks to fix user mistakes in QEMU, and on top of that
adding infrastructure for it (QEMU is already too complicated).
Since in this case it breaks migration only partially, it's sufficient
to print an error message that suggests the correct size to use (as has been done
for other boards; s390 is the last remaining one doing ram_size fixups).
That way the user can fix the config and restart migration.
Christian Borntraeger March 31, 2020, 11:17 a.m. UTC | #12
On 27.03.20 23:13, Igor Mammedov wrote:
> On Fri, 27 Mar 2020 17:53:39 +0100
> David Hildenbrand <david@redhat.com> wrote:
> 
>> On 27.03.20 17:46, Igor Mammedov wrote:
>>> On Fri, 27 Mar 2020 17:05:34 +0100
>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>>>   
>>>> On 27.03.20 17:01, David Hildenbrand wrote:  
>>>>> On 27.03.20 16:34, Christian Borntraeger wrote:    
>>>>>>
>>>>>>
>>>>>> On 27.03.20 16:29, David Hildenbrand wrote:    
>>>>>>> Historically, we fixed up the RAM size (rounded it down), to fit into
>>>>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
>>>>>>> memdev for RAM"), we no longer consider the fixed-up size when
>>>>>>> allocating the RAM block - which will break migration.
>>>>>>>
>>>>>>> Let's simply drop that manual fixup code and let the user supply sane
>>>>>>> RAM sizes. This will bail out early when trying to migrate (and make
>>>>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we
>>>>>>> should have rejected such RAM sizes right from the beginning.
>>>>>>>
>>>>>>> As we no longer fixup maxram_size as well, make other users use ram_size
>>>>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM,
>>>>>>> as that will come in handy in the future when supporting memory hotplug
>>>>>>> (in contrast, storage keys and storage attributes for hotplugged memory
>>>>>>>  will have to be migrated per RAM block in the future).
>>>>>>>
>>>>>>> This fixes (or rather rejects early):
>>>>>>>
>>>>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
>>>>>>>    RAM block size changed.    
>>>>>>
>>>>>> Not sure I like this variant. Instead of breaking migration (that was 
>>>>>> accidentally done by Igor's changes) we now reject migration from older
>>>>>> QEMUs to 5.0. This is not going to help those that still have such guests
>>>>>> running and want to migrate.     
>>>>>
>>>>> As Igor mentioned on another channel, you most probably can migrate an
>>>>> older guest by starting it on the target with a fixed-up size.
>>>>>
>>>>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M"    
>>>>
>>>> Yes, that should probably work.  
>>> I'm in process of testing it.
> 
> it works
> 
>>>   
>>>>> Not sure how many such weird-size VMs we actually do have in practice.    
>>>>
>>>> I am worried about some automated deployments where tooling has created
>>>> these sizes for dozens or hundreds of containers in VMs and so on.
>>
>> IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be
>> surprised if it's not multiples of, say, 128MB.
>>
>>> Yep, it's possible but then that tooling/configs should be fixed to work with
>>> new QEMU that validates user's input.
>>>   
>>
>> Yeah, and mention it in the cover letter, +eventually a "fixup" table
>> (e.g., old_size < X, has to be aligned to Y).
>>
>> One alternative is to have an early fixup hack in QEMU, that fixes up
>> the sizes as we did before (and eventually warns the user). Not sure if
>> we really want/need that.
> That would require at least a callback at machine level, 
> also practice shows warnings are of no use.

I would strongly prefer to not break setups that used to work and do an early fixup
(a machine callback). I will have a look.

> 
> If someone insists on using a wrong size with new QEMU, one can write a wrapper
> script to work around the issue (if they don't wish to fix configs/tooling).
> I don't like keeping hacks to fix user mistakes in QEMU, and on top of that
> adding infrastructure for it (QEMU is already too complicated).
> Since in this case it breaks migration only partially, it's sufficient
> to print an error message that suggests the correct size to use (as has been done
> for other boards; s390 is the last remaining one doing ram_size fixups).
> That way the user can fix the config and restart migration.
>
Igor Mammedov March 31, 2020, 3:33 p.m. UTC | #13
On Tue, 31 Mar 2020 13:17:38 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 27.03.20 23:13, Igor Mammedov wrote:
> > On Fri, 27 Mar 2020 17:53:39 +0100
> > David Hildenbrand <david@redhat.com> wrote:
> >   
> >> On 27.03.20 17:46, Igor Mammedov wrote:  
> >>> On Fri, 27 Mar 2020 17:05:34 +0100
> >>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> >>>     
> >>>> On 27.03.20 17:01, David Hildenbrand wrote:    
> >>>>> On 27.03.20 16:34, Christian Borntraeger wrote:      
> >>>>>>
> >>>>>>
> >>>>>> On 27.03.20 16:29, David Hildenbrand wrote:      
> >>>>>>> Historically, we fixed up the RAM size (rounded it down), to fit into
> >>>>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
> >>>>>>> memdev for RAM"), we no longer consider the fixed-up size when
> >>>>>>> allocating the RAM block - which will break migration.
> >>>>>>>
> >>>>>>> Let's simply drop that manual fixup code and let the user supply sane
> >>>>>>> RAM sizes. This will bail out early when trying to migrate (and make
> >>>>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we
> >>>>>>> should have rejected such RAM sizes right from the beginning.
> >>>>>>>
> >>>>>>> As we no longer fixup maxram_size as well, make other users use ram_size
> >>>>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM,
> >>>>>>> as that will come in handy in the future when supporting memory hotplug
> >>>>>>> (in contrast, storage keys and storage attributes for hotplugged memory
> >>>>>>>  will have to be migrated per RAM block in the future).
> >>>>>>>
> >>>>>>> This fixes (or rather rejects early):
> >>>>>>>
> >>>>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
> >>>>>>>    RAM block size changed.      
> >>>>>>
> >>>>>> Not sure I like this variant. Instead of breaking migration (that was 
> >>>>>> accidentally done by Igor's changes) we now reject migration from older
> >>>>>> QEMUs to 5.0. This is not going to help those that still have such guests
> >>>>>> running and want to migrate.       
> >>>>>
> >>>>> As Igor mentioned on another channel, you most probably can migrate an
> >>>>> older guest by starting it on the target with a fixed-up size.
> >>>>>
> >>>>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M"      
> >>>>
> >>>> Yes, that should probably work.    
> >>> I'm in process of testing it.  
> > 
> > it works
> >   
> >>>     
> >>>>> Not sure how many such weird-size VMs we actually do have in practice.      
> >>>>
> >>>> I am worried about some automated deployments where tooling has created
> >>>> these sizes for dozens or hundreds of containers in VMs and so on.
> >>
> >> IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be
> >> surprised if it's not multiples of, say, 128MB.
> >>  
> >>> Yep, it's possible but then that tooling/configs should be fixed to work with
> >>> new QEMU that validates user's input.
> >>>     
> >>
> >> Yeah, and mention it in the cover letter, +eventually a "fixup" table
> >> (e.g., old_size < X, has to be aligned to Y).
> >>
> >> One alternative is to have an early fixup hack in QEMU, that fixes up
> >> the sizes as we did before (and eventually warns the user). Not sure if
> >> we really want/need that.  
> > That would require at least a callback at machine level, 
> > also practice shows warnings are of no use.  
> 
> I would strongly prefer to not break setups that used to work and do an early fixup
> (a machine callback). I will have a look.

If it were breaking migration stream or guest ABI,
I'd agree with you but it isn't.

So in this case, it's a bug that QEMU wasn't checking
the size for alignment, and adding a workaround to keep the bug
around looks wrong to me.
It is fixable on the user side (one has to fix the VM's config),
so it should be fixed there and not in QEMU.
And any automatic tooling that generates an invalid size
should be fixed as well.
(I think it's not QEMU's job to mask user errors by
doing something the user did not ask for)


> > If someone insists on using a wrong size with new QEMU, one can write a wrapper
> > script to work around the issue (if they don't wish to fix configs/tooling).
> > I don't like keeping hacks to fix user mistakes in QEMU, and on top of that
> > adding infrastructure for it (QEMU is already too complicated).
> > Since in this case it breaks migration only partially, it's sufficient
> > to print an error message that suggests the correct size to use (as has been done
> > for other boards; s390 is the last remaining one doing ram_size fixups).
> > That way the user can fix the config and restart migration.
> >   
> 
>
David Hildenbrand March 31, 2020, 3:39 p.m. UTC | #14
On 31.03.20 17:33, Igor Mammedov wrote:
> On Tue, 31 Mar 2020 13:17:38 +0200
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> 
>> On 27.03.20 23:13, Igor Mammedov wrote:
>>> On Fri, 27 Mar 2020 17:53:39 +0100
>>> David Hildenbrand <david@redhat.com> wrote:
>>>   
>>>> On 27.03.20 17:46, Igor Mammedov wrote:  
>>>>> On Fri, 27 Mar 2020 17:05:34 +0100
>>>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>>>>>     
>>>>>> On 27.03.20 17:01, David Hildenbrand wrote:    
>>>>>>> On 27.03.20 16:34, Christian Borntraeger wrote:      
>>>>>>>>
>>>>>>>>
>>>>>>>> On 27.03.20 16:29, David Hildenbrand wrote:      
>>>>>>>>> Historically, we fixed up the RAM size (rounded it down), to fit into
>>>>>>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
>>>>>>>>> memdev for RAM"), we no longer consider the fixed-up size when
>>>>>>>>> allocating the RAM block - which will break migration.
>>>>>>>>>
>>>>>>>>> Let's simply drop that manual fixup code and let the user supply sane
>>>>>>>>> RAM sizes. This will bail out early when trying to migrate (and make
>>>>>>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we
>>>>>>>>> should have rejected such RAM sizes right from the beginning.
>>>>>>>>>
>>>>>>>>> As we no longer fixup maxram_size as well, make other users use ram_size
>>>>>>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM,
>>>>>>>>> as that will come in handy in the future when supporting memory hotplug
>>>>>>>>> (in contrast, storage keys and storage attributes for hotplugged memory
>>>>>>>>>  will have to be migrated per RAM block in the future).
>>>>>>>>>
>>>>>>>>> This fixes (or rather rejects early):
>>>>>>>>>
>>>>>>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
>>>>>>>>>    RAM block size changed.      
>>>>>>>>
>>>>>>>> Not sure I like this variant. Instead of breaking migration (that was 
>>>>>>>> accidentally done by Igor's changes) we now reject migration from older
>>>>>>>> QEMUs to 5.0. This is not going to help those that still have such guests
>>>>>>>> running and want to migrate.       
>>>>>>>
>>>>>>> As Igor mentioned on another channel, you most probably can migrate an
>>>>>>> older guest by starting it on the target with a fixed-up size.
>>>>>>>
>>>>>>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M"      
>>>>>>
>>>>>> Yes, that should probably work.    
>>>>> I'm in process of testing it.  
>>>
>>> it works
>>>   
>>>>>     
>>>>>>> Not sure how many such weird-size VMs we actually do have in practice.      
>>>>>>
>>>>>> I am worried about some automated deployments where tooling has created
>>>>>> these sizes for dozens or hundreds of containers in VMs and so on.
>>>>
>>>> IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be
>>>> surprised if it's not multiples of, say, 128MB.
>>>>  
>>>>> Yep, it's possible but then that tooling/configs should be fixed to work with
>>>>> new QEMU that validates user's input.
>>>>>     
>>>>
>>>> Yeah, and mention it in the cover letter, +eventually a "fixup" table
>>>> (e.g., old_size < X, has to be aligned to Y).
>>>>
>>>> One alternative is to have an early fixup hack in QEMU, that fixes up
>>>> the sizes as we did before (and eventually warns the user). Not sure if
>>>> we really want/need that.  
>>> That would require at least a callback at machine level, 
>>> also practice shows warnings are of no use.  
>>
>> I would strongly prefer to not break setups that used to work and do an early fixup
>> (a machine callback). I will have a look.
> 
> If it were breaking migration stream or guest ABI,
> I'd agree with you but it isn't.
> 
> So in this case, it's a bug that QEMU wasn't checking
> the size for alignment, and adding a workaround to keep the bug
> around looks wrong to me.
> It is fixable on the user side (one has to fix the VM's config),
> so it should be fixed there and not in QEMU.
> And any automatic tooling that generates an invalid size
> should be fixed as well.
> (I think it's not QEMU's job to mask user errors by
> doing something the user did not ask for)

Dave G mentioned that, e.g., via Cockpit it can be fairly easy to
produce weird RAM sizes (via a slider). So I guess this "issue" could be
more widespread than I initially thought.

I agree that it's a BUG that we didn't bail out but instead decided to
fix it up.
Christian Borntraeger March 31, 2020, 3:39 p.m. UTC | #15
On 31.03.20 17:33, Igor Mammedov wrote:
> On Tue, 31 Mar 2020 13:17:38 +0200
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> 
>> On 27.03.20 23:13, Igor Mammedov wrote:
>>> On Fri, 27 Mar 2020 17:53:39 +0100
>>> David Hildenbrand <david@redhat.com> wrote:
>>>   
>>>> On 27.03.20 17:46, Igor Mammedov wrote:  
>>>>> On Fri, 27 Mar 2020 17:05:34 +0100
>>>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>>>>>     
>>>>>> On 27.03.20 17:01, David Hildenbrand wrote:    
>>>>>>> On 27.03.20 16:34, Christian Borntraeger wrote:      
>>>>>>>>
>>>>>>>>
>>>>>>>> On 27.03.20 16:29, David Hildenbrand wrote:      
>>>>>>>>> Historically, we fixed up the RAM size (rounded it down), to fit into
>>>>>>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
>>>>>>>>> memdev for RAM"), we no longer consider the fixed-up size when
>>>>>>>>> allocating the RAM block - which will break migration.
>>>>>>>>>
>>>>>>>>> Let's simply drop that manual fixup code and let the user supply sane
>>>>>>>>> RAM sizes. This will bail out early when trying to migrate (and make
>>>>>>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we
>>>>>>>>> should have rejected such RAM sizes right from the beginning.
>>>>>>>>>
>>>>>>>>> As we no longer fixup maxram_size as well, make other users use ram_size
>>>>>>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM,
>>>>>>>>> as that will come in handy in the future when supporting memory hotplug
>>>>>>>>> (in contrast, storage keys and storage attributes for hotplugged memory
>>>>>>>>>  will have to be migrated per RAM block in the future).
>>>>>>>>>
>>>>>>>>> This fixes (or rather rejects early):
>>>>>>>>>
>>>>>>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
>>>>>>>>>    RAM block size changed.      
>>>>>>>>
>>>>>>>> Not sure I like this variant. Instead of breaking migration (that was 
>>>>>>>> accidentally done by Igor's changes) we now reject migration from older
>>>>>>>> QEMUs to 5.0. This is not going to help those that still have such guests
>>>>>>>> running and want to migrate.       
>>>>>>>
>>>>>>> As Igor mentioned on another channel, you most probably can migrate an
>>>>>>> older guest by starting it on the target with a fixed-up size.
>>>>>>>
>>>>>>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M"      
>>>>>>
>>>>>> Yes, that should probably work.    
>>>>> I'm in process of testing it.  
>>>
>>> it works
>>>   
>>>>>     
>>>>>>> Not sure how many such weird-size VMs we actually do have in practice.      
>>>>>>
>>>>>> I am worried about some automated deployments where tooling has created
>>>>>> these sizes for dozens or hundreds of containers in VMs and so on.
>>>>
>>>> IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be
>>>> surprised if it's not multiples of, say, 128MB.
>>>>  
>>>>> Yep, it's possible but then that tooling/configs should be fixed to work with
>>>>> new QEMU that validates user's input.
>>>>>     
>>>>
>>>> Yeah, and mention it in the cover letter, +eventually a "fixup" table
>>>> (e.g., old_size < X, has to be aligned to Y).
>>>>
>>>> One alternative is to have an early fixup hack in QEMU, that fixes up
>>>> the sizes as we did before (and eventually warns the user). Not sure if
>>>> we really want/need that.  
>>> That would require at least a callback at machine level, 
>>> also practice shows warnings are of no use.  
>>
>> I would strongly prefer to not break setups that used to work and do an early fixup
>> (a machine callback). I will have a look.
> 
> If it were breaking migration stream or guest ABI,
> I'd agree with you but it isn't.
> 
> So in this case, it's a bug that QEMU wasn't checking
> the size for alignment, and adding a workaround to keep the bug
> around looks wrong to me.

QEMU did check for alignment and fixed it up. So it was an intentional decision (maybe a
bad one). Your change broke working setups and I really do not like that.
See my new patch for proper handling for machine types 5.0 and later + compat
handling for older machines.
Christian Borntraeger March 31, 2020, 3:42 p.m. UTC | #16
On 31.03.20 17:39, David Hildenbrand wrote:
> On 31.03.20 17:33, Igor Mammedov wrote:
>> On Tue, 31 Mar 2020 13:17:38 +0200
>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>>
>>> On 27.03.20 23:13, Igor Mammedov wrote:
>>>> On Fri, 27 Mar 2020 17:53:39 +0100
>>>> David Hildenbrand <david@redhat.com> wrote:
>>>>   
>>>>> On 27.03.20 17:46, Igor Mammedov wrote:  
>>>>>> On Fri, 27 Mar 2020 17:05:34 +0100
>>>>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>>>>>>     
>>>>>>> On 27.03.20 17:01, David Hildenbrand wrote:    
>>>>>>>> On 27.03.20 16:34, Christian Borntraeger wrote:      
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 27.03.20 16:29, David Hildenbrand wrote:      
>>>>>>>>>> Historically, we fixed up the RAM size (rounded it down), to fit into
>>>>>>>>>> storage increments. Since commit 3a12fc61af5c ("390x/s390-virtio-ccw: use
>>>>>>>>>> memdev for RAM"), we no longer consider the fixed-up size when
>>>>>>>>>> allocating the RAM block - which will break migration.
>>>>>>>>>>
>>>>>>>>>> Let's simply drop that manual fixup code and let the user supply sane
>>>>>>>>>> RAM sizes. This will bail out early when trying to migrate (and make
>>>>>>>>>> an existing guest with e.g., 12345 MB non-migratable), but maybe we
>>>>>>>>>> should have rejected such RAM sizes right from the beginning.
>>>>>>>>>>
>>>>>>>>>> As we no longer fixup maxram_size as well, make other users use ram_size
>>>>>>>>>> instead. Keep using maxram_size when setting the maximum ram size in KVM,
>>>>>>>>>> as that will come in handy in the future when supporting memory hotplug
>>>>>>>>>> (in contrast, storage keys and storage attributes for hotplugged memory
>>>>>>>>>>  will have to be migrated per RAM block in the future).
>>>>>>>>>>
>>>>>>>>>> This fixes (or rather rejects early):
>>>>>>>>>>
>>>>>>>>>> 1. Migrating older QEMU to upstream QEMU (e.g., with "-m 1235M"), as the
>>>>>>>>>>    RAM block size changed.      
>>>>>>>>>
>>>>>>>>> Not sure I like this variant. Instead of breaking migration (that was 
>>>>>>>>> accidentally done by Igor's changes) we now reject migration from older
>>>>>>>>> QEMUs to 5.0. This is not going to help those that still have such guests
>>>>>>>>> running and want to migrate.       
>>>>>>>>
>>>>>>>> As Igor mentioned on another channel, you most probably can migrate an
>>>>>>>> older guest by starting it on the target with a fixed-up size.
>>>>>>>>
>>>>>>>> E.g., migrate an old QEMU "-m 1235M" to a new QEMU "-m 1234M"      
>>>>>>>
>>>>>>> Yes, that should probably work.    
>>>>>> I'm in process of testing it.  
>>>>
>>>> it works
>>>>   
>>>>>>     
>>>>>>>> Not sure how many such weird-size VMs we actually do have in practice.      
>>>>>>>
>>>>>>> I am worried about some automated deployments where tooling has created
>>>>>>> these sizes for dozens or hundreds of containers in VMs and so on.
>>>>>
>>>>> IIRC, e.g., Kata usually uses 2048MB. Not sure about others, but I'd be
>>>>> surprised if it's not multiples of, say, 128MB.
>>>>>  
>>>>>> Yep, it's possible but then that tooling/configs should be fixed to work with
>>>>>> new QEMU that validates user's input.
>>>>>>     
>>>>>
>>>>> Yeah, and mention it in the cover letter, +eventually a "fixup" table
>>>>> (e.g., old_size < X, has to be aligned to Y).
>>>>>
>>>>> One alternative is to have an early fixup hack in QEMU, that fixes up
>>>>> the sizes as we did before (and eventually warns the user). Not sure if
>>>>> we really want/need that.  
>>>> That would require at least a callback at machine level, 
>>>> also practice shows warnings are of no use.  
>>>
>>> I would strongly prefer to not break setups that used to work and do an early fixup
>>> (a machine callback). I will have a look.
>>
>> If it were breaking migration stream or guest ABI,
>> I'd agree with you but it isn't.
>>
>> So in this case, it's a bug that QEMU wasn't checking
>> the size for alignment, and adding a workaround to keep the bug
>> around looks wrong to me.
>> It is fixable on the user side (one has to fix the VM's config),
>> so it should be fixed there and not in QEMU.
>> And any automatic tooling that generates an invalid size
>> should be fixed as well.
>> (I think it's not QEMU's job to mask user errors by
>> doing something the user did not ask for)
> 
> Dave G mentioned, that e.g., via Cockpit it can be fairly easy to
> produce weird RAM sizes (via a slider). So I guess this "issue" could be
> more widespread than I initially thought.
> 
> I agree that it's a BUG that we didn't bail out but instead decided to
> fix it up.

It was a decision back then. Likely a wrong one.
Now insisting on "let's break user setups because it violates my feeling
of perfection" is not the right answer though.

Patch

diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
index 5da6e5292f..2545b1576b 100644
--- a/hw/s390x/s390-skeys.c
+++ b/hw/s390x/s390-skeys.c
@@ -11,7 +11,6 @@ 
 
 #include "qemu/osdep.h"
 #include "qemu/units.h"
-#include "hw/boards.h"
 #include "hw/s390x/storage-keys.h"
 #include "qapi/error.h"
 #include "qapi/qapi-commands-misc-target.h"
@@ -174,9 +173,8 @@  out:
 static void qemu_s390_skeys_init(Object *obj)
 {
     QEMUS390SKeysState *skeys = QEMU_S390_SKEYS(obj);
-    MachineState *machine = MACHINE(qdev_get_machine());
 
-    skeys->key_count = machine->maxram_size / TARGET_PAGE_SIZE;
+    skeys->key_count = ram_size / TARGET_PAGE_SIZE;
     skeys->keydata = g_malloc0(skeys->key_count);
 }
 
diff --git a/hw/s390x/s390-stattrib-kvm.c b/hw/s390x/s390-stattrib-kvm.c
index c7e1f35524..ae88fbc32e 100644
--- a/hw/s390x/s390-stattrib-kvm.c
+++ b/hw/s390x/s390-stattrib-kvm.c
@@ -10,7 +10,6 @@ 
  */
 
 #include "qemu/osdep.h"
-#include "hw/boards.h"
 #include "migration/qemu-file.h"
 #include "hw/s390x/storage-attributes.h"
 #include "qemu/error-report.h"
@@ -84,8 +83,7 @@  static int kvm_s390_stattrib_set_stattr(S390StAttribState *sa,
                                         uint8_t *values)
 {
     KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa);
-    MachineState *machine = MACHINE(qdev_get_machine());
-    unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE;
+    unsigned long max = ram_size / TARGET_PAGE_SIZE;
 
     if (start_gfn + count > max) {
         error_report("Out of memory bounds when setting storage attributes");
@@ -103,8 +101,7 @@  static int kvm_s390_stattrib_set_stattr(S390StAttribState *sa,
 static void kvm_s390_stattrib_synchronize(S390StAttribState *sa)
 {
     KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa);
-    MachineState *machine = MACHINE(qdev_get_machine());
-    unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE;
+    unsigned long max = ram_size / TARGET_PAGE_SIZE;
     /* We do not need to reach the maximum buffer size allowed */
     unsigned long cx, len = KVM_S390_SKEYS_MAX / 2;
     int r;
diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index af0bfbc2ec..6af471fb3f 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -326,8 +326,7 @@  out:
 
 static void sclp_memory_init(SCLPDevice *sclp)
 {
-    MachineState *machine = MACHINE(qdev_get_machine());
-    ram_addr_t initial_mem = machine->ram_size;
+    uint64_t initial_mem = ram_size;
     int increment_size = 20;
 
     /* The storage increment size is a multiple of 1M and is a power of 2.
@@ -339,15 +338,17 @@  static void sclp_memory_init(SCLPDevice *sclp)
     }
     sclp->increment_size = increment_size;
 
-    /* The core memory area needs to be aligned with the increment size.
-     * In effect, this can cause the user-specified memory size to be rounded
-     * down to align with the nearest increment boundary. */
+    /*
+     * The core memory area needs to be aligned to the increment size. In
+     * case it's not aligned, bail out.
+     */
     initial_mem = initial_mem >> increment_size << increment_size;
-
-    machine->ram_size = initial_mem;
-    machine->maxram_size = initial_mem;
-    /* let's propagate the changed ram size into the global variable. */
-    ram_size = initial_mem;
+    if (initial_mem != ram_size) {
+        error_report("RAM size not aligned to storage increments."
+                     " Possible aligned RAM size: %" PRIu64 " MB",
+                     initial_mem / MiB);
+        exit(1);
+    }
 }
 
 static void sclp_init(Object *obj)
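
With this patch applied, an unaligned size is rejected at startup. A sketch of
the expected behaviour (invocation abbreviated; message wording taken from the
error_report() above):

$ qemu-system-s390x -m 1235M ...
qemu-system-s390x: RAM size not aligned to storage increments. Possible aligned RAM size: 1234 MB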