[v2,00/13] migrate/ram: Fix resizing RAM blocks while migrating

Message ID 20200221164204.105570-1-david@redhat.com

Message

David Hildenbrand Feb. 21, 2020, 4:41 p.m. UTC
This is the follow-up to
    "[PATCH RFC] memory: Don't allow to resize RAM while migrating" [1]

This series contains some (slightly modified) patches also contained in:
    "[PATCH v2 fixed 00/16] Ram blocks with resizable anonymous allocations
     under POSIX" [2]
That series will be based on this series. The last patch (#13) in this
series could be moved to the other series, but I decided to include it in
here for now (similar context).

I realized that resizing RAM blocks while the guest is being migrated
(precopy: resize while still running on the source, postcopy: resize
 while already running on the target) is buggy. In case of precopy, we
can simply cancel migration. Postcopy handling is more involved. Resizing
can currently happen during a guest reboot, triggered by ACPI rebuilds.
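
To make the two cases concrete, here is a minimal sketch of the policy
described above. It is not code from this series: MigPhase, MigState,
ResizeAction and ram_block_resized() are hypothetical names chosen for
illustration, and the real notifier plumbing in QEMU looks different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified migration phases; not QEMU's real state enum. */
typedef enum {
    MIG_IDLE,             /* no migration in progress */
    MIG_PRECOPY_SOURCE,   /* guest still runs on the source */
    MIG_POSTCOPY_TARGET   /* guest already runs on the target */
} MigPhase;

typedef struct {
    MigPhase phase;
    bool cancel_requested;
} MigState;

typedef enum {
    RESIZE_NO_MIGRATION_ACTION,
    RESIZE_CANCELS_MIGRATION,
    RESIZE_NEEDS_POSTCOPY_HANDLING
} ResizeAction;

/*
 * Decide what a RAM block resize means for an ongoing migration.
 * Precopy on the source: simply cancel. Postcopy on the target: the
 * caller has to do the more involved handling (not modelled here).
 */
static ResizeAction ram_block_resized(MigState *s, size_t old_size,
                                      size_t new_size)
{
    assert(s != NULL);

    if (old_size == new_size) {
        return RESIZE_NO_MIGRATION_ACTION;
    }

    switch (s->phase) {
    case MIG_PRECOPY_SOURCE:
        s->cancel_requested = true;
        return RESIZE_CANCELS_MIGRATION;
    case MIG_POSTCOPY_TARGET:
        return RESIZE_NEEDS_POSTCOPY_HANDLING;
    default:
        return RESIZE_NO_MIGRATION_ACTION;
    }
}
```

The sketch only classifies the event; it deliberately says nothing
about how the postcopy case is resolved.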

Along with the fixes, this series contains some cleanups.

[1] https://lkml.kernel.org/r/20200213172016.196609-1-david@redhat.com
[2] https://lkml.kernel.org/r/20200212134254.11073-1-david@redhat.com

I was now able to actually test resizing while migrating. I am using the
prototype of virtio-mem to test (which also makes use of resizable
allocations). Things I was able to reproduce:
- Resize while still running on the migration source. Migration is canceled
-- Test case for "migration/ram: Handle RAM block resizes during precopy"
- Resize (grow+shrink) on the migration target during postcopy migration
  (when syncing RAM blocks), while not yet running on the target
-- Test case for "migration/ram: Discard new RAM when growing RAM blocks
   and the VM is stopped", and overall RAM size synchronization. Seems to
   work just fine.
- Resize (grow+shrink) on the migration target during postcopy migration
  while already running on the target.
-- Test case for "migration/ram: Handle RAM block resizes during postcopy"
-- Test case for "migration/ram: Tolerate partially changed mappings in
   postcopy code" - I can see that -ENOENT is actually triggered and that
   migration succeeds. Migration seems to work just fine.

In addition, I ran the avocado-vt migration tests and the usual QEMU checks.

v1 -> v2:
- "util: vfio-helpers: Factor out and fix processing of existing ram
   blocks"
-- Stringify error
- "migration/ram: Handle RAM block resizes during precopy"
-- Simplified check if we're migrating on the source
- "exec: Relax range check in ram_block_discard_range()"
-- Added to make discard during resizes actually work
- "migration/ram: Discard new RAM when growing RAM blocks after
   ram_postcopy_incoming_init()"
-- Better checks if in the right postcopy mode.
-- Better patch subject/description/comments
- "migration/ram: Handle RAM block resizes during postcopy"
-- Better comments
-- Adapt to changed postcopy checks
- "migrate/ram: Get rid of "place_source" in ram_load_postcopy()"
-- Dropped, as broken
- "migration/ram: Tolerate partially changed mappings in postcopy code"
-- Better comment / description. Clarify that no implicit wakeup will
   happen
-- Warn on EINVAL (older kernels)
-- Wake up any waiter explicitly
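
The last item can be pictured roughly as follows. This is a sketch
under stated assumptions, not the code from the patch: FakePage,
fake_place(), fake_wake() and place_page_tolerant() are hypothetical
stand-ins (for a UFFDIO_COPY-style placement and a UFFDIO_WAKE-style
wakeup), and how EINVAL is handled beyond the warning is not stated
here, so the sketch simply propagates it.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the state behind one page placement. */
typedef struct {
    int place_result;   /* what the fake placement returns: 0 or -errno */
    int wakeups;        /* how often waiters were woken explicitly */
} FakePage;

/* Stand-in for a UFFDIO_COPY-style call; returns 0 or a negative errno. */
static int fake_place(FakePage *p)
{
    return p->place_result;
}

/* Stand-in for an explicit UFFDIO_WAKE-style wakeup. */
static void fake_wake(FakePage *p)
{
    p->wakeups++;
}

/* Place a page, tolerating a mapping that changed underneath us. */
static int place_page_tolerant(FakePage *p)
{
    int ret;

    assert(p != NULL);
    ret = fake_place(p);

    if (ret == -ENOENT) {
        /*
         * The mapping changed (e.g. the RAM block was resized), so
         * nothing was placed and nobody was woken implicitly. Wake
         * any waiter explicitly and treat this as non-fatal.
         */
        fake_wake(p);
        return 0;
    }
    if (ret == -EINVAL) {
        /* Assumption: only warn, then report the error unchanged. */
        fprintf(stderr, "warning: placement failed with EINVAL "
                        "(older kernel?)\n");
    }
    return ret;
}
```

With a FakePage whose place_result is -ENOENT, the call returns 0 and
wakeups becomes 1; a successful placement leaves wakeups untouched.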


David Hildenbrand (13):
  util: vfio-helpers: Factor out and fix processing of existing ram
    blocks
  stubs/ram-block: Remove stubs that are no longer needed
  numa: Teach ram block notifiers about resizeable ram blocks
  numa: Make all callbacks of ram block notifiers optional
  migration/ram: Handle RAM block resizes during precopy
  exec: Relax range check in ram_block_discard_range()
  migration/ram: Discard RAM when growing RAM blocks after
    ram_postcopy_incoming_init()
  migration/ram: Simplify host page handling in ram_load_postcopy()
  migration/ram: Consolidate variable reset after placement in
    ram_load_postcopy()
  migration/ram: Handle RAM block resizes during postcopy
  migration/multifd: Print used_length of memory block
  migration/ram: Use offset_in_ramblock() in range checks
  migration/ram: Tolerate partially changed mappings in postcopy code

 exec.c                     |  27 +++++--
 hw/core/numa.c             |  41 +++++++++--
 hw/i386/xen/xen-mapcache.c |   7 +-
 include/exec/cpu-common.h  |   1 +
 include/exec/memory.h      |  10 +--
 include/exec/ramblock.h    |  10 +++
 include/exec/ramlist.h     |  13 ++--
 migration/migration.c      |   9 ++-
 migration/migration.h      |   1 +
 migration/multifd.c        |   2 +-
 migration/postcopy-ram.c   |  52 +++++++++++++-
 migration/ram.c            | 144 ++++++++++++++++++++++++++++---------
 stubs/ram-block.c          |  20 ------
 target/i386/hax-mem.c      |   5 +-
 target/i386/sev.c          |  18 ++---
 util/vfio-helpers.c        |  41 ++++-------
 16 files changed, 283 insertions(+), 118 deletions(-)

Comments

Peter Xu Feb. 21, 2020, 6:04 p.m. UTC | #1
On Fri, Feb 21, 2020 at 05:41:51PM +0100, David Hildenbrand wrote:
> I was now able to actually test resizing while migrating. I am using the
> prototype of virtio-mem to test (which also makes use of resizable
> allocations). Things I was able to reproduce:

The test cases cover quite a lot.  Thanks for doing that.

> - Resize while still running on the migration source. Migration is canceled
> -- Test case for "migration/ram: Handle RAM block resizes during precopy"

> - Resize (grow+shrink) on the migration target during postcopy migration
>   (when syncing RAM blocks), while not yet running on the target
> -- Test case for "migration/ram: Discard new RAM when growing RAM blocks
>    and the VM is stopped", and overall RAM size synchronization. Seems to
>    work just fine.

This won't be able to trigger without virtio-mem, right?

And I'm also curious about how to test this even with virtio-mem.  Is
that a QMP command to extend/shrink virtio-mem?

> - Resize (grow+shrink) on the migration target during postcopy migration
>   while already running on the target.
> -- Test case for "migration/ram: Handle RAM block resizes during postcopy"
> -- Test case for "migration/ram: Tolerate partially changed mappings in
>    postcopy code" - I can see that -ENOENT is actually triggered and that
>    migration succeeds. Migration seems to work just fine.

Thanks,
David Hildenbrand Feb. 24, 2020, 9:09 a.m. UTC | #2
On 21.02.20 19:04, Peter Xu wrote:
>> - Resize (grow+shrink) on the migration target during postcopy migration
>>   (when syncing RAM blocks), while not yet running on the target
>> -- Test case for "migration/ram: Discard new RAM when growing RAM blocks
>>    and the VM is stopped", and overall RAM size synchronization. Seems to
>>    work just fine.
> 
> This won't be able to trigger without virtio-mem, right?

AFAIK all cases can also be triggered without virtio-mem (just not that
easily :) ). This case would be "RAM block is bigger on source than on
destination.".

> 
> And I'm also curious on how to test this even with virtio-mem.  Is
> that a QMP command to extend/shrink virtio-mem?

Currently, there is a single QOM property that can be modified via
QMP/HMP - "requested-size". With resizable memory backends,
increasing the requested size will also implicitly grow the RAM block.
Shrinking the requested size will currently result in shrinking the RAM
block on the next reboot.

So, to trigger growing of a RAM block (assuming requested-size was
smaller before, e.g., 1000M)

echo "qom-set vm1 requested-size 6000M" | sudo nc -U $MON

To trigger shrinking (assuming requested-size was bigger before)

echo "qom-set vm1 requested-size 100M" | sudo nc -U $MON
echo 'system_reset' | sudo nc -U $MON


Placing these at the right spots during a migration makes it possible to
test this very reliably.
Peter Xu Feb. 24, 2020, 5:45 p.m. UTC | #3
On Mon, Feb 24, 2020 at 10:09:19AM +0100, David Hildenbrand wrote:
> Currently, there is a single QOM property that can be modified via
> QMP/HMP - "requested-size". With resizable memory backends,
> increasing the requested size will also implicitly grow the RAM block.
> Shrinking the requested size will currently result in shrinking the RAM
> block on the next reboot.
> 
> So, to trigger growing of a RAM block (assuming requested-size was
> smaller before, e.g., 1000M)
> 
> echo "qom-set vm1 requested-size 6000M" | sudo nc -U $MON
> 
> To trigger shrinking (assuming requested-size was bigger before)
> 
> echo "qom-set vm1 requested-size 100M" | sudo nc -U $MON
> echo 'system_reset' | sudo nc -U $MON
> 
> 
> Placing these at the right spots during a migration allows to test this
> very reliably.

I see, thanks for the context.  The question was mainly about when
you say "during postcopy migration (when syncing RAM blocks), while
not yet running on the target" - it's not easy to do so imho, because:

  - it's a very short transition period between precopy and postcopy,
    so I was curious about how you made sure that the grow/shrink
    happened exactly during that period

  - during the period, IIUC it was still in the main thread, which
    means logically QEMU should not be able to respond to any QMP/HMP
    command at all...  So even if you send a command, I think it'll
    only be executed later after the transition completes

  - I'm not sure about this, but ... even for virtio-mem, the resizing
    can only happen after the guest acks it, right?  During the precopy to
    postcopy transition period, the VM is stopped, AFAICT, so
    logically we can't trigger resizing during the transition

So it's really a question of whether we even need to consider that
transition period for resize events during postcopy.  Maybe we don't
even need to.

Thanks,
David Hildenbrand Feb. 24, 2020, 6:44 p.m. UTC | #4
On 24.02.20 18:45, Peter Xu wrote:
> I see, thanks for the context.  The question was majorly about when
> you say "during postcopy migration (when syncing RAM blocks), while
> not yet running on the target" - it's not easy to do so imho, because:

This case is very easy to trigger, even with acpi. Simply have a ram
block on the source be bigger than one on the target. The sync code
(migration/ram.c:qemu_ram_resize()) will perform the resize during
precopy. Postcopy fails to discard the additional memory.

Maybe my description was confusing. But this really just triggers when

- Postcopy is advised and discards memory on all ram blocks
- Precopy grows the RAM block when syncing the RAM block sizes with the
  source

Postcopy fails to discard the new RAM.
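
A minimal sketch of that sequence and of the missing discard, assuming a
simplified block model. SketchRAMBlock, sketch_discard_range() and
sketch_sync_block_size() are hypothetical names, not QEMU functions; the
real size sync sits in the load path and works on RAMBlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, stripped-down model of a resizable RAM block. */
typedef struct {
    uint64_t used_length;        /* currently used size */
    uint64_t max_length;         /* upper bound for resizes */
    /* Bookkeeping so the effect of the sketch can be observed. */
    uint64_t last_discard_start;
    uint64_t last_discard_len;
    unsigned discards;
} SketchRAMBlock;

/* Stand-in for discarding a range; it only records the request. */
static void sketch_discard_range(SketchRAMBlock *rb, uint64_t start,
                                 uint64_t len)
{
    /* The sketch checks against max_length, not used_length. */
    assert(start + len <= rb->max_length);
    rb->last_discard_start = start;
    rb->last_discard_len = len;
    rb->discards++;
}

/*
 * Sync the destination block size with the size announced by the
 * source. If postcopy was advised, all RAM was discarded earlier, so
 * the range added by a grow has to be discarded as well.
 */
static int sketch_sync_block_size(SketchRAMBlock *rb, uint64_t source_len,
                                  bool postcopy_advised)
{
    uint64_t old_len = rb->used_length;

    if (source_len > rb->max_length) {
        return -1;               /* cannot grow beyond the maximum */
    }
    rb->used_length = source_len;

    if (postcopy_advised && source_len > old_len) {
        sketch_discard_range(rb, old_len, source_len - old_len);
    }
    return 0;
}
```

Growing a block from 4096 to 8192 bytes with postcopy advised records
one discard of the range starting at 4096 with length 4096; shrinking
records none.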

> 
>   - it's a very short transition period between precopy and postcopy,
>     so I was curious about how you made sure that the grow/shrink
>     happened exactly during that period
> 
>   - during the period, IIUC it was still in the main thread, which
>     means logically QEMU should not be able to respond to any QMP/HMP
>     command at all...  So even if you send a command, I think it'll
>     only be executed later after the transition completes
> 
>   - this I'm not sure, but ... even for virtio-mem, the resizing can
>     only happen after guest ack it, right?  During the precopy to
>     postcopy transition period, the VM is stopped, AFAICT, so
>     logically we can't trigger resizing during the transition
> 
> So it's really a question/matter of whether we still even need to
> consider that transition period for resizing event for postcopy.
> Maybe we don't even need to.

It's synchronous and not a race. So it does matter very much :)
David Hildenbrand Feb. 24, 2020, 6:59 p.m. UTC | #5
On 24.02.20 19:44, David Hildenbrand wrote:
> On 24.02.20 18:45, Peter Xu wrote:
>>   - it's a very short transition period between precopy and postcopy,
>>     so I was curious about how you made sure that the grow/shrink
>>     happened exactly during that period
>>
>>   - during the period, IIUC it was still in the main thread, which
>>     means logically QEMU should not be able to respond to any QMP/HMP
>>     command at all...  So even if you send a command, I think it'll
>>     only be executed later after the transition completes
>>
>>   - this I'm not sure, but ... even for virtio-mem, the resizing can
>>     only happen after guest ack it, right?  During the precopy to
>>     postcopy transition period, the VM is stopped, AFAICT, so
>>     logically we can't trigger resizing during the transition

Regarding that question: Resizes will happen without guest interaction
(e.g., during a reboot, or when increasing the requested size). In the
future, there are theoretical plans to have resizes that can be
triggered by guest interaction/request to some extent as well.
Peter Xu Feb. 24, 2020, 7:18 p.m. UTC | #6
On Mon, Feb 24, 2020 at 07:59:10PM +0100, David Hildenbrand wrote:
> On 24.02.20 19:44, David Hildenbrand wrote:
> > This case is very easy to trigger, even with acpi. Simply have a ram
> > block on the source be bigger than one on the target. The sync code
> > (migration/ram.c:qemu_ram_resize()) will perform the resize during
> > precopy. Postcopy misses to discard the additional memory.

But when resizing happens during precopy, we should cancel this
migration directly?  Hmm?...

> > 
> > Maybe my description was confusing. But this really just triggers when
> > 
> > - Postcopy is advised and discards memory on all ram blocks
> > - Precopy grows the RAM block when syncing the RAM block sizes with the
> > source
> > 
> > Postcopy misses to discard the new RAM.
> > 
> >>
> >>   - it's a very short transition period between precopy and postcopy,
> >>     so I was curious about how you made sure that the grow/shrink
> >>     happened exactly during that period
> >>
> >>   - during the period, IIUC it was still in the main thread, which
> >>     means logically QEMU should not be able to respond to any QMP/HMP
> >>     command at all...  So even if you send a command, I think it'll
> >>     only be executed later after the transition completes
> >>
> >>   - this I'm not sure, but ... even for virtio-mem, the resizing can
> >>     only happen after guest ack it, right?  During the precopy to
> >>     postcopy transition period, the VM is stopped, AFAICT, so
> >>     logically we can't trigger resizing during the transition
> 
> Regarding that question: Resizes will happen without guest interaction
> (e.g., during a reboot, or when increasing the requested size). In the
> future, there are theoretical plans to have resizes that can be
> triggered by guest interaction/request to some extend as well.

I see.  I was thinking about shrinking case which should probably need
an acknowledgement from the guest, but yes increasing seems to be fine
even without it.  Thanks,
David Hildenbrand Feb. 24, 2020, 7:34 p.m. UTC | #7
> On 24.02.2020 at 20:19, Peter Xu <peterx@redhat.com> wrote:
> 
> On Mon, Feb 24, 2020 at 07:59:10PM +0100, David Hildenbrand wrote:
>>> On 24.02.20 19:44, David Hildenbrand wrote:
>>> This case is very easy to trigger, even with acpi. Simply have a ram
>>> block on the source be bigger than one on the target. The sync code
>>> (migration/ram.c:qemu_ram_resize()) will perform the resize during
>>> precopy. Postcopy misses to discard the additional memory.
> 
> But when resizing happens during precopy, we should cancel this
> migration directly?  Hmm?...

?

We are talking about the migration target, not the source. Please have a
look at the RAM block size sync code I mentioned. That's probably faster
than me having to explain it (and obviously failing to do so :) ).
Peter Xu Feb. 24, 2020, 8:04 p.m. UTC | #8
On Mon, Feb 24, 2020 at 08:34:16PM +0100, David Hildenbrand wrote:
> > On 24.02.2020 at 20:19, Peter Xu <peterx@redhat.com> wrote:
> > 
> > On Mon, Feb 24, 2020 at 07:59:10PM +0100, David Hildenbrand wrote:
> >>>>>>> - Resize (grow+shrink) on the migration target during postcopy migration

[2]

> >>>>>>>  (when syncing RAM blocks), while not yet running on the target
> >>>>>>> -- Test case for "migration/ram: Discard new RAM when growing RAM blocks
> >>>>>>>   and the VM is stopped", and overall RAM size synchronization. Seems to
> >>>>>>>   work just fine.
> >>> 
> >>> This case is very easy to trigger, even with acpi. Simply have a ram
> >>> block on the source be bigger than one on the target. The sync code
> >>> (migration/ram.c:qemu_ram_resize()) will perform the resize during

[1]

> >>> precopy. Postcopy misses to discard the additional memory.
> > 
> > But when resizing happens during precopy, we should cancel this
> > migration directly?  Hmm?...
> 
> ?
> 
> We are talking about the migration target, not the source. Please have a look at the RAM block size sync code I mentioned. That's probably faster than me having to explain it (and obviously failing to do so :) ).

OK finally I noticed you meant migration/ram.c:ram_load_precopy() [1]
not qemu_ram_resize().  And at [2] I think you meant during precopy
migration, not postcopy.  Those are probably the things that made me
confused.  And yes we need to consider this case.  Thanks,
David Hildenbrand Feb. 24, 2020, 8:54 p.m. UTC | #9
> On 24.02.2020 at 21:04, Peter Xu <peterx@redhat.com> wrote:
> 
> On Mon, Feb 24, 2020 at 08:34:16PM +0100, David Hildenbrand wrote:
>>>> On 24.02.2020 at 20:19, Peter Xu <peterx@redhat.com> wrote:
>>> 
>>> On Mon, Feb 24, 2020 at 07:59:10PM +0100, David Hildenbrand wrote:
>>>>>>>>> - Resize (grow+shrink) on the migration target during postcopy migration
> 
> [2]
> 
>>>>>>>>> (when syncing RAM blocks), while not yet running on the target
>>>>>>>>> -- Test case for "migration/ram: Discard new RAM when growing RAM blocks
>>>>>>>>>  and the VM is stopped", and overall RAM size synchronization. Seems to
>>>>>>>>>  work just fine.
>>>>> 
>>>>> This case is very easy to trigger, even with acpi. Simply have a ram
>>>>> block on the source be bigger than one on the target. The sync code
>>>>> (migration/ram.c:qemu_ram_resize()) will perform the resize during
> 
> [1]
> 
>>>>> precopy. Postcopy misses to discard the additional memory.
>>> 
>>> But when resizing happens during precopy, we should cancel this
>>> migration directly?  Hmm?...
>> 
>> ?
>> 
>> We are talking about the migration target, not the source. Please have a look at the RAM block size sync code I mentioned. That‘s probably faster than me having to explain it (and obviously failing to do so :) ).
> 
> OK finally I noticed you meant migration/ram.c:ram_load_precopy() [1]
> not qemu_ram_resize().

Right, the single invocation of qemu_ram_resize() in that file/function.

> And at [2] I think you meant during precopy
> migration, not postcopy.

The precopy stage when postcopy was advised. Yes, it's confusing :)

> Those are probably the things that made me
> confused.  And yes we need to consider this case.  Thanks,

Thanks for having a look!

> 
> -- 
> Peter Xu
>