
[v3] migration: discard non-migratable RAMBlocks

Message ID 20180510220248.10272-1-clg@kaod.org
State New
Series [v3] migration: discard non-migratable RAMBlocks

Commit Message

Cédric Le Goater May 10, 2018, 10:02 p.m. UTC
On the POWER9 processor, the XIVE interrupt controller can control
interrupt sources using MMIO to trigger events, to EOI or to turn off
the sources. Priority management and interrupt acknowledgment is also
controlled by MMIO in the presenter sub-engine.

These MMIO regions are exposed to guests in QEMU with a set of 'ram
device' memory mappings, similarly to VFIO, and the VMAs are populated
dynamically with the appropriate pages using a fault handler.

But, these regions are an issue for migration. We need to discard the
associated RAMBlocks from the RAM state on the source VM and let the
destination VM rebuild the memory mappings on the new host in the
post_load() operation just before resuming the system.

To achieve this goal, the following introduces a new RAMBlock flag
RAM_MIGRATABLE which is updated in the vmstate_register_ram() and
vmstate_unregister_ram() routines. This flag is then used by the
migration to identify RAMBlocks to discard on the source. Some checks
are also performed on the destination to make sure nothing invalid was
sent.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

 Changes since v2:

 - added an error_report() in ram_save_host_page() 
 - un/set the RAMBlock RAM_MIGRATABLE directly under vmstate_un/register_ram()
   with some new flag helpers 
 
 exec.c                    | 18 ++++++++++++++++++
 include/exec/cpu-common.h |  3 +++
 migration/ram.c           | 43 +++++++++++++++++++++++++++++++++----------
 migration/savevm.c        |  2 ++
 4 files changed, 56 insertions(+), 10 deletions(-)
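
For context on why these RAMBlocks exist but must not be migrated: a
'ram device' region is created around host MMIO pages and is never
passed to vmstate_register_ram(), so with this patch its RAMBlock
simply never gains the new RAM_MIGRATABLE flag. The snippet below is
an illustrative sketch only (the function, the "dev.mmio" name and the
parameters are made up; it is not code from the XIVE model or from
this patch):

#include "qemu/osdep.h"
#include "exec/memory.h"
#include "hw/qdev-core.h"           /* DeviceState */
#include "migration/vmstate.h"      /* vmstate_register_ram() */

/* Hypothetical device code mapping hardware-backed MMIO pages. */
static void map_hw_mmio_pages(DeviceState *dev, MemoryRegion *sysmem,
                              void *host_mmio, hwaddr base, uint64_t size)
{
    MemoryRegion *mr = g_new0(MemoryRegion, 1);

    /* Backed by host MMIO pages, populated on fault; not guest RAM state. */
    memory_region_init_ram_device_ptr(mr, OBJECT(dev), "dev.mmio",
                                      size, host_mmio);
    memory_region_add_subregion(sysmem, base, mr);

    /* Ordinary guest RAM, by contrast, goes through
     * vmstate_register_ram(ram_mr, dev), which with this patch applied
     * also sets RAM_MIGRATABLE on the underlying RAMBlock. */
}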

Comments

no-reply@patchew.org May 10, 2018, 10:07 p.m. UTC | #1
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180510220248.10272-1-clg@kaod.org
Subject: [Qemu-devel] [PATCH v3] migration: discard non-migratable RAMBlocks

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]               patchew/20180510220248.10272-1-clg@kaod.org -> patchew/20180510220248.10272-1-clg@kaod.org
Switched to a new branch 'test'
0fb68ae2eb migration: discard non-migratable RAMBlocks

=== OUTPUT BEGIN ===
Checking PATCH 1/1: migration: discard non-migratable RAMBlocks...
ERROR: Macros with multiple statements should be enclosed in a do - while loop
#92: FILE: migration/ram.c:191:
+#define RAMBLOCK_FOREACH_MIGRATABLE(block)             \
+    RAMBLOCK_FOREACH(block)                            \
+        if (!qemu_ram_is_migratable(block)) {} else

ERROR: trailing statements should be on next line
#94: FILE: migration/ram.c:193:
+        if (!qemu_ram_is_migratable(block)) {} else

total: 2 errors, 0 warnings, 179 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Peter Xu May 11, 2018, 4:59 a.m. UTC | #2
On Fri, May 11, 2018 at 12:02:48AM +0200, Cédric Le Goater wrote:
> On the POWER9 processor, the XIVE interrupt controller can control
> interrupt sources using MMIO to trigger events, to EOI or to turn off
> the sources. Priority management and interrupt acknowledgment is also
> controlled by MMIO in the presenter sub-engine.
> 
> These MMIO regions are exposed to guests in QEMU with a set of 'ram
> device' memory mappings, similarly to VFIO, and the VMAs are populated
> dynamically with the appropriate pages using a fault handler.
> 
> But, these regions are an issue for migration. We need to discard the
> associated RAMBlocks from the RAM state on the source VM and let the
> destination VM rebuild the memory mappings on the new host in the
> post_load() operation just before resuming the system.
> 
> To achieve this goal, the following introduces a new RAMBlock flag
> RAM_MIGRATABLE which is updated in the vmstate_register_ram() and
> vmstate_unregister_ram() routines. This flag is then used by the
> migration to identify RAMBlocks to discard on the source. Some checks
> are also performed on the destination to make sure nothing invalid was
> sent.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> 
>  Changes since v2:
> 
>  - added an error_report() in ram_save_host_page() 
>  - un/set the RAMBlock RAM_MIGRATABLE directly under vmstate_un/register_ram()
>    with some new flag helpers 
>  
>  exec.c                    | 18 ++++++++++++++++++
>  include/exec/cpu-common.h |  3 +++
>  migration/ram.c           | 43 +++++++++++++++++++++++++++++++++----------
>  migration/savevm.c        |  2 ++
>  4 files changed, 56 insertions(+), 10 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index c7fcefa851b2..079c5c8bab7b 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -104,6 +104,9 @@ static MemoryRegion io_mem_unassigned;
>   * (Set during postcopy)
>   */
>  #define RAM_UF_ZEROPAGE (1 << 3)
> +
> +/* RAM can be migrated */
> +#define RAM_MIGRATABLE (1 << 4)
>  #endif
>  
>  #ifdef TARGET_PAGE_BITS_VARY
> @@ -1797,6 +1800,21 @@ void qemu_ram_set_uf_zeroable(RAMBlock *rb)
>      rb->flags |= RAM_UF_ZEROPAGE;
>  }
>  
> +bool qemu_ram_is_migratable(RAMBlock *rb)
> +{
> +    return rb->flags & RAM_MIGRATABLE;
> +}
> +
> +void qemu_ram_set_migratable(RAMBlock *rb)
> +{
> +    rb->flags |= RAM_MIGRATABLE;
> +}
> +
> +void qemu_ram_unset_migratable(RAMBlock *rb)
> +{
> +    rb->flags &= ~RAM_MIGRATABLE;
> +}
> +
>  /* Called with iothread lock held.  */
>  void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState *dev)
>  {
> diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
> index 24d335f95d45..488288fce959 100644
> --- a/include/exec/cpu-common.h
> +++ b/include/exec/cpu-common.h
> @@ -75,6 +75,9 @@ const char *qemu_ram_get_idstr(RAMBlock *rb);
>  bool qemu_ram_is_shared(RAMBlock *rb);
>  bool qemu_ram_is_uf_zeroable(RAMBlock *rb);
>  void qemu_ram_set_uf_zeroable(RAMBlock *rb);
> +bool qemu_ram_is_migratable(RAMBlock *rb);
> +void qemu_ram_set_migratable(RAMBlock *rb);
> +void qemu_ram_unset_migratable(RAMBlock *rb);
>  
>  size_t qemu_ram_pagesize(RAMBlock *block);
>  size_t qemu_ram_pagesize_largest(void);
> diff --git a/migration/ram.c b/migration/ram.c
> index 912810c18e0f..dfdec78ecb03 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -187,6 +187,11 @@ void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr,
>                        nr);
>  }
>  
> +/* Should be holding either ram_list.mutex, or the RCU lock. */
> +#define RAMBLOCK_FOREACH_MIGRATABLE(block)             \
> +    RAMBLOCK_FOREACH(block)                            \
> +        if (!qemu_ram_is_migratable(block)) {} else

Pure question: is there a reason for the explicit use of "else" rather than:

      if (qemu_ram_is_migratable(block))

?

> +
>  /*
>   * An outstanding page request, on the source, having been received
>   * and queued
> @@ -813,6 +818,10 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
>      unsigned long *bitmap = rb->bmap;
>      unsigned long next;
>  
> +    if (!qemu_ram_is_migratable(rb)) {
> +        return size;
> +    }

(I would change the caller side in find_dirty_block(); we could just
 skip the block there if it is not migratable.  But this is also fine
 by me.)

> +
>      if (rs->ram_bulk_stage && start > 0) {
>          next = start + 1;
>      } else {
> @@ -858,7 +867,7 @@ uint64_t ram_pagesize_summary(void)
>      RAMBlock *block;
>      uint64_t summary = 0;
>  
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          summary |= block->page_size;
>      }
>  
> @@ -882,7 +891,7 @@ static void migration_bitmap_sync(RAMState *rs)
>  
>      qemu_mutex_lock(&rs->bitmap_mutex);
>      rcu_read_lock();
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          migration_bitmap_sync_range(rs, block, 0, block->used_length);
>      }
>      rcu_read_unlock();
> @@ -1521,6 +1530,11 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
>      size_t pagesize_bits =
>          qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
>  
> +    if (!qemu_ram_is_migratable(pss->block)) {
> +        error_report("block %s should not be migrated !", pss->block->idstr);
> +        return 0;
> +    }
> +
>      do {
>          /* Check the pages is dirty and if it is send it */
>          if (!migration_bitmap_clear_dirty(rs, pss->block, pss->page)) {
> @@ -1619,7 +1633,7 @@ uint64_t ram_bytes_total(void)
>      uint64_t total = 0;
>  
>      rcu_read_lock();
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          total += block->used_length;
>      }
>      rcu_read_unlock();
> @@ -1674,7 +1688,7 @@ static void ram_save_cleanup(void *opaque)
>       */
>      memory_global_dirty_log_stop();
>  
> -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          g_free(block->bmap);
>          block->bmap = NULL;
>          g_free(block->unsentmap);
> @@ -1737,7 +1751,7 @@ void ram_postcopy_migrated_memory_release(MigrationState *ms)
>  {
>      struct RAMBlock *block;
>  
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          unsigned long *bitmap = block->bmap;
>          unsigned long range = block->used_length >> TARGET_PAGE_BITS;
>          unsigned long run_start = find_next_zero_bit(bitmap, range, 0);
> @@ -1815,7 +1829,7 @@ static int postcopy_each_ram_send_discard(MigrationState *ms)
>      struct RAMBlock *block;
>      int ret;
>  
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          PostcopyDiscardState *pds =
>              postcopy_discard_send_init(ms, block->idstr);
>  
> @@ -2023,7 +2037,7 @@ int ram_postcopy_send_discard_bitmap(MigrationState *ms)
>      rs->last_sent_block = NULL;
>      rs->last_page = 0;
>  
> -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          unsigned long pages = block->used_length >> TARGET_PAGE_BITS;
>          unsigned long *bitmap = block->bmap;
>          unsigned long *unsentmap = block->unsentmap;
> @@ -2182,7 +2196,7 @@ static void ram_list_init_bitmaps(void)
>  
>      /* Skip setting bitmap if there is no RAM */
>      if (ram_bytes_total()) {
> -        QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        RAMBLOCK_FOREACH_MIGRATABLE(block) {
>              pages = block->max_length >> TARGET_PAGE_BITS;
>              block->bmap = bitmap_new(pages);
>              bitmap_set(block->bmap, 0, pages);
> @@ -2263,7 +2277,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>  
>      qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>  
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          qemu_put_byte(f, strlen(block->idstr));
>          qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
>          qemu_put_be64(f, block->used_length);
> @@ -2507,6 +2521,11 @@ static inline RAMBlock *ram_block_from_stream(QEMUFile *f, int flags)
>          return NULL;
>      }
>  
> +    if (!qemu_ram_is_migratable(block)) {
> +        error_report("block %s should not be migrated !", id);
> +        return NULL;
> +    }
> +
>      return block;
>  }
>  
> @@ -3011,7 +3030,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>                  length = qemu_get_be64(f);
>  
>                  block = qemu_ram_block_by_name(id);
> -                if (block) {
> +                if (block && !qemu_ram_is_migratable(block)) {
> +                    error_report("block %s should not be migrated !", id);
> +                    ret = -EINVAL;
> +

(extra new line)

I saw quite a few such checks in the patch that dump errors.
Frankly speaking, I would still prefer assertions if we are sure
these are programming errors, and I believe incorrect marking of
RAMBlocks is one.  I think I understand the concern here: we are
being careful in the migration code to avoid crashing the VM.  But at
least here we are doing precopy loading, so there is no problem using
an assertion since it won't crash the source (and after all we are
returning -EINVAL, so I suppose the destination VM will quit anyway,
much like a crash here).

I don't have a strong opinion on this and the current patch actually
looks good to me; I am only raising the question.

Thanks,

> +                } else if (block) {
>                      if (length != block->used_length) {
>                          Error *local_err = NULL;
>  
> diff --git a/migration/savevm.c b/migration/savevm.c
> index e2be02afe42c..9ebfba738ea4 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2501,11 +2501,13 @@ void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
>  {
>      qemu_ram_set_idstr(mr->ram_block,
>                         memory_region_name(mr), dev);
> +    qemu_ram_set_migratable(mr->ram_block);
>  }
>  
>  void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
>  {
>      qemu_ram_unset_idstr(mr->ram_block);
> +    qemu_ram_unset_migratable(mr->ram_block);
>  }
>  
>  void vmstate_register_ram_global(MemoryRegion *mr)
> -- 
> 2.13.6
> 
>
Cédric Le Goater May 11, 2018, 6:21 a.m. UTC | #3
Hello David,

Yesterday evening, I forgot to add the quote requested by Peter and
to add the MIPS maintainers. 

Could you please add the paragraph below if the patch is accepted?
I think it's worth resending a v4 for that.

Thanks,

C.

On 05/11/2018 12:02 AM, Cédric Le Goater wrote:
> On the POWER9 processor, the XIVE interrupt controller can control
> interrupt sources using MMIO to trigger events, to EOI or to turn off
> the sources. Priority management and interrupt acknowledgment is also
> controlled by MMIO in the presenter sub-engine.
> 
> These MMIO regions are exposed to guests in QEMU with a set of 'ram
> device' memory mappings, similarly to VFIO, and the VMAs are populated
> dynamically with the appropriate pages using a fault handler.
> 
> But, these regions are an issue for migration. We need to discard the
> associated RAMBlocks from the RAM state on the source VM and let the
> destination VM rebuild the memory mappings on the new host in the
> post_load() operation just before resuming the system.
> 
> To achieve this goal, the following introduces a new RAMBlock flag
> RAM_MIGRATABLE which is updated in the vmstate_register_ram() and
> vmstate_unregister_ram() routines. This flag is then used by the
> migration to identify RAMBlocks to discard on the source. Some checks
> are also performed on the destination to make sure nothing invalid was
> sent.

This change impacts the MIPS boston, malta and jazz boards, for
which migration compatibility is broken.

> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> 
>  Changes since v2:
> 
>  - added an error_report() in ram_save_host_page() 
>  - un/set the RAMBlock RAM_MIGRATABLE directly under vmstate_un/register_ram()
>    with some new flag helpers 
>  
>  exec.c                    | 18 ++++++++++++++++++
>  include/exec/cpu-common.h |  3 +++
>  migration/ram.c           | 43 +++++++++++++++++++++++++++++++++----------
>  migration/savevm.c        |  2 ++
>  4 files changed, 56 insertions(+), 10 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index c7fcefa851b2..079c5c8bab7b 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -104,6 +104,9 @@ static MemoryRegion io_mem_unassigned;
>   * (Set during postcopy)
>   */
>  #define RAM_UF_ZEROPAGE (1 << 3)
> +
> +/* RAM can be migrated */
> +#define RAM_MIGRATABLE (1 << 4)
>  #endif
>  
>  #ifdef TARGET_PAGE_BITS_VARY
> @@ -1797,6 +1800,21 @@ void qemu_ram_set_uf_zeroable(RAMBlock *rb)
>      rb->flags |= RAM_UF_ZEROPAGE;
>  }
>  
> +bool qemu_ram_is_migratable(RAMBlock *rb)
> +{
> +    return rb->flags & RAM_MIGRATABLE;
> +}
> +
> +void qemu_ram_set_migratable(RAMBlock *rb)
> +{
> +    rb->flags |= RAM_MIGRATABLE;
> +}
> +
> +void qemu_ram_unset_migratable(RAMBlock *rb)
> +{
> +    rb->flags &= ~RAM_MIGRATABLE;
> +}
> +
>  /* Called with iothread lock held.  */
>  void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState *dev)
>  {
> diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
> index 24d335f95d45..488288fce959 100644
> --- a/include/exec/cpu-common.h
> +++ b/include/exec/cpu-common.h
> @@ -75,6 +75,9 @@ const char *qemu_ram_get_idstr(RAMBlock *rb);
>  bool qemu_ram_is_shared(RAMBlock *rb);
>  bool qemu_ram_is_uf_zeroable(RAMBlock *rb);
>  void qemu_ram_set_uf_zeroable(RAMBlock *rb);
> +bool qemu_ram_is_migratable(RAMBlock *rb);
> +void qemu_ram_set_migratable(RAMBlock *rb);
> +void qemu_ram_unset_migratable(RAMBlock *rb);
>  
>  size_t qemu_ram_pagesize(RAMBlock *block);
>  size_t qemu_ram_pagesize_largest(void);
> diff --git a/migration/ram.c b/migration/ram.c
> index 912810c18e0f..dfdec78ecb03 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -187,6 +187,11 @@ void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr,
>                        nr);
>  }
>  
> +/* Should be holding either ram_list.mutex, or the RCU lock. */
> +#define RAMBLOCK_FOREACH_MIGRATABLE(block)             \
> +    RAMBLOCK_FOREACH(block)                            \
> +        if (!qemu_ram_is_migratable(block)) {} else
> +
>  /*
>   * An outstanding page request, on the source, having been received
>   * and queued
> @@ -813,6 +818,10 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
>      unsigned long *bitmap = rb->bmap;
>      unsigned long next;
>  
> +    if (!qemu_ram_is_migratable(rb)) {
> +        return size;
> +    }
> +
>      if (rs->ram_bulk_stage && start > 0) {
>          next = start + 1;
>      } else {
> @@ -858,7 +867,7 @@ uint64_t ram_pagesize_summary(void)
>      RAMBlock *block;
>      uint64_t summary = 0;
>  
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          summary |= block->page_size;
>      }
>  
> @@ -882,7 +891,7 @@ static void migration_bitmap_sync(RAMState *rs)
>  
>      qemu_mutex_lock(&rs->bitmap_mutex);
>      rcu_read_lock();
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          migration_bitmap_sync_range(rs, block, 0, block->used_length);
>      }
>      rcu_read_unlock();
> @@ -1521,6 +1530,11 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
>      size_t pagesize_bits =
>          qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
>  
> +    if (!qemu_ram_is_migratable(pss->block)) {
> +        error_report("block %s should not be migrated !", pss->block->idstr);
> +        return 0;
> +    }
> +
>      do {
>          /* Check the pages is dirty and if it is send it */
>          if (!migration_bitmap_clear_dirty(rs, pss->block, pss->page)) {
> @@ -1619,7 +1633,7 @@ uint64_t ram_bytes_total(void)
>      uint64_t total = 0;
>  
>      rcu_read_lock();
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          total += block->used_length;
>      }
>      rcu_read_unlock();
> @@ -1674,7 +1688,7 @@ static void ram_save_cleanup(void *opaque)
>       */
>      memory_global_dirty_log_stop();
>  
> -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          g_free(block->bmap);
>          block->bmap = NULL;
>          g_free(block->unsentmap);
> @@ -1737,7 +1751,7 @@ void ram_postcopy_migrated_memory_release(MigrationState *ms)
>  {
>      struct RAMBlock *block;
>  
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          unsigned long *bitmap = block->bmap;
>          unsigned long range = block->used_length >> TARGET_PAGE_BITS;
>          unsigned long run_start = find_next_zero_bit(bitmap, range, 0);
> @@ -1815,7 +1829,7 @@ static int postcopy_each_ram_send_discard(MigrationState *ms)
>      struct RAMBlock *block;
>      int ret;
>  
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          PostcopyDiscardState *pds =
>              postcopy_discard_send_init(ms, block->idstr);
>  
> @@ -2023,7 +2037,7 @@ int ram_postcopy_send_discard_bitmap(MigrationState *ms)
>      rs->last_sent_block = NULL;
>      rs->last_page = 0;
>  
> -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          unsigned long pages = block->used_length >> TARGET_PAGE_BITS;
>          unsigned long *bitmap = block->bmap;
>          unsigned long *unsentmap = block->unsentmap;
> @@ -2182,7 +2196,7 @@ static void ram_list_init_bitmaps(void)
>  
>      /* Skip setting bitmap if there is no RAM */
>      if (ram_bytes_total()) {
> -        QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        RAMBLOCK_FOREACH_MIGRATABLE(block) {
>              pages = block->max_length >> TARGET_PAGE_BITS;
>              block->bmap = bitmap_new(pages);
>              bitmap_set(block->bmap, 0, pages);
> @@ -2263,7 +2277,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>  
>      qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>  
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          qemu_put_byte(f, strlen(block->idstr));
>          qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
>          qemu_put_be64(f, block->used_length);
> @@ -2507,6 +2521,11 @@ static inline RAMBlock *ram_block_from_stream(QEMUFile *f, int flags)
>          return NULL;
>      }
>  
> +    if (!qemu_ram_is_migratable(block)) {
> +        error_report("block %s should not be migrated !", id);
> +        return NULL;
> +    }
> +
>      return block;
>  }
>  
> @@ -3011,7 +3030,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>                  length = qemu_get_be64(f);
>  
>                  block = qemu_ram_block_by_name(id);
> -                if (block) {
> +                if (block && !qemu_ram_is_migratable(block)) {
> +                    error_report("block %s should not be migrated !", id);
> +                    ret = -EINVAL;
> +
> +                } else if (block) {
>                      if (length != block->used_length) {
>                          Error *local_err = NULL;
>  
> diff --git a/migration/savevm.c b/migration/savevm.c
> index e2be02afe42c..9ebfba738ea4 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2501,11 +2501,13 @@ void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
>  {
>      qemu_ram_set_idstr(mr->ram_block,
>                         memory_region_name(mr), dev);
> +    qemu_ram_set_migratable(mr->ram_block);
>  }
>  
>  void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
>  {
>      qemu_ram_unset_idstr(mr->ram_block);
> +    qemu_ram_unset_migratable(mr->ram_block);
>  }
>  
>  void vmstate_register_ram_global(MemoryRegion *mr)
>
Cédric Le Goater May 11, 2018, 6:58 a.m. UTC | #4
On 05/11/2018 06:59 AM, Peter Xu wrote:
> On Fri, May 11, 2018 at 12:02:48AM +0200, Cédric Le Goater wrote:
>> On the POWER9 processor, the XIVE interrupt controller can control
>> interrupt sources using MMIO to trigger events, to EOI or to turn off
>> the sources. Priority management and interrupt acknowledgment is also
>> controlled by MMIO in the presenter sub-engine.
>>
>> These MMIO regions are exposed to guests in QEMU with a set of 'ram
>> device' memory mappings, similarly to VFIO, and the VMAs are populated
>> dynamically with the appropriate pages using a fault handler.
>>
>> But, these regions are an issue for migration. We need to discard the
>> associated RAMBlocks from the RAM state on the source VM and let the
>> destination VM rebuild the memory mappings on the new host in the
>> post_load() operation just before resuming the system.
>>
>> To achieve this goal, the following introduces a new RAMBlock flag
>> RAM_MIGRATABLE which is updated in the vmstate_register_ram() and
>> vmstate_unregister_ram() routines. This flag is then used by the
>> migration to identify RAMBlocks to discard on the source. Some checks
>> are also performed on the destination to make sure nothing invalid was
>> sent.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>
>>  Changes since v2:
>>
>>  - added an error_report() in ram_save_host_page() 
>>  - un/set the RAMBlock RAM_MIGRATABLE directly under vmstate_un/register_ram()
>>    with some new flag helpers 
>>  
>>  exec.c                    | 18 ++++++++++++++++++
>>  include/exec/cpu-common.h |  3 +++
>>  migration/ram.c           | 43 +++++++++++++++++++++++++++++++++----------
>>  migration/savevm.c        |  2 ++
>>  4 files changed, 56 insertions(+), 10 deletions(-)
>>
>> diff --git a/exec.c b/exec.c
>> index c7fcefa851b2..079c5c8bab7b 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -104,6 +104,9 @@ static MemoryRegion io_mem_unassigned;
>>   * (Set during postcopy)
>>   */
>>  #define RAM_UF_ZEROPAGE (1 << 3)
>> +
>> +/* RAM can be migrated */
>> +#define RAM_MIGRATABLE (1 << 4)
>>  #endif
>>  
>>  #ifdef TARGET_PAGE_BITS_VARY
>> @@ -1797,6 +1800,21 @@ void qemu_ram_set_uf_zeroable(RAMBlock *rb)
>>      rb->flags |= RAM_UF_ZEROPAGE;
>>  }
>>  
>> +bool qemu_ram_is_migratable(RAMBlock *rb)
>> +{
>> +    return rb->flags & RAM_MIGRATABLE;
>> +}
>> +
>> +void qemu_ram_set_migratable(RAMBlock *rb)
>> +{
>> +    rb->flags |= RAM_MIGRATABLE;
>> +}
>> +
>> +void qemu_ram_unset_migratable(RAMBlock *rb)
>> +{
>> +    rb->flags &= ~RAM_MIGRATABLE;
>> +}
>> +
>>  /* Called with iothread lock held.  */
>>  void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState *dev)
>>  {
>> diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
>> index 24d335f95d45..488288fce959 100644
>> --- a/include/exec/cpu-common.h
>> +++ b/include/exec/cpu-common.h
>> @@ -75,6 +75,9 @@ const char *qemu_ram_get_idstr(RAMBlock *rb);
>>  bool qemu_ram_is_shared(RAMBlock *rb);
>>  bool qemu_ram_is_uf_zeroable(RAMBlock *rb);
>>  void qemu_ram_set_uf_zeroable(RAMBlock *rb);
>> +bool qemu_ram_is_migratable(RAMBlock *rb);
>> +void qemu_ram_set_migratable(RAMBlock *rb);
>> +void qemu_ram_unset_migratable(RAMBlock *rb);
>>  
>>  size_t qemu_ram_pagesize(RAMBlock *block);
>>  size_t qemu_ram_pagesize_largest(void);
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 912810c18e0f..dfdec78ecb03 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -187,6 +187,11 @@ void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr,
>>                        nr);
>>  }
>>  
>> +/* Should be holding either ram_list.mutex, or the RCU lock. */
>> +#define RAMBLOCK_FOREACH_MIGRATABLE(block)             \
>> +    RAMBLOCK_FOREACH(block)                            \
>> +        if (!qemu_ram_is_migratable(block)) {} else
> 
> Pure question: is there a reason for the explicit use of "else" rather than:
> 
>       if (qemu_ram_is_migratable(block))
> 
> ?

The above is an imperfect macro 'trick' to cope with calling code that
omits the {}. It provides some protection.
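
To spell the trick out (a standalone, illustrative sketch with
hypothetical stand-ins, not QEMU code): with a naive
"if (qemu_ram_is_migratable(block))" form, an "else" written by the
caller after the loop body silently binds to the macro's hidden "if";
the posted "if (!...) {} else" form leaves no unmatched "if" in the
expansion, so such an "else" pairs with the caller's own "if" as
intended.

#include <stdio.h>

static int blocks[] = { 1, 0, 1 };       /* 1 = migratable, 0 = not */
#define NBLOCKS 3

/* naive form: a bare 'if' is left dangling at the end of the expansion */
#define FOREACH_MIGRATABLE_NAIVE(i) \
    for (int i = 0; i < NBLOCKS; i++) if (blocks[i])

/* posted form: the inner 'if' already owns an 'else', nothing dangles */
#define FOREACH_MIGRATABLE(i) \
    for (int i = 0; i < NBLOCKS; i++) if (!blocks[i]) {} else

int main(void)
{
    int cond = 0;

    if (cond)
        FOREACH_MIGRATABLE_NAIVE(i)
            printf("naive: block %d\n", i);
    else
        printf("naive: else branch\n");  /* pairs with the macro's inner
                                          * 'if': skipped entirely when
                                          * cond == 0 (gcc -Wall warns) */

    if (cond)
        FOREACH_MIGRATABLE(i)
            printf("posted: block %d\n", i);
    else
        printf("posted: else branch\n"); /* pairs with 'if (cond)': runs
                                          * once, as intended */

    return 0;
}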
   
>> +
>>  /*
>>   * An outstanding page request, on the source, having been received
>>   * and queued
>> @@ -813,6 +818,10 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
>>      unsigned long *bitmap = rb->bmap;
>>      unsigned long next;
>>  
>> +    if (!qemu_ram_is_migratable(rb)) {
>> +        return size;
>> +    }
> 
> (I would change the caller side in find_dirty_block(); we could just
>  skip the block there if it is not migratable.  But this is also fine
>  by me.)

Hmm. The do {} while() loop and its callees in ram_find_and_save_block()
are a bit complex to follow. I stumbled a few times when trying to
change it.

>> +
>>      if (rs->ram_bulk_stage && start > 0) {
>>          next = start + 1;
>>      } else {
>> @@ -858,7 +867,7 @@ uint64_t ram_pagesize_summary(void)
>>      RAMBlock *block;
>>      uint64_t summary = 0;
>>  
>> -    RAMBLOCK_FOREACH(block) {
>> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>>          summary |= block->page_size;
>>      }
>>  
>> @@ -882,7 +891,7 @@ static void migration_bitmap_sync(RAMState *rs)
>>  
>>      qemu_mutex_lock(&rs->bitmap_mutex);
>>      rcu_read_lock();
>> -    RAMBLOCK_FOREACH(block) {
>> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>>          migration_bitmap_sync_range(rs, block, 0, block->used_length);
>>      }
>>      rcu_read_unlock();
>> @@ -1521,6 +1530,11 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
>>      size_t pagesize_bits =
>>          qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
>>  
>> +    if (!qemu_ram_is_migratable(pss->block)) {
>> +        error_report("block %s should not be migrated !", pss->block->idstr);
>> +        return 0;
>> +    }
>> +
>>      do {
>>          /* Check the pages is dirty and if it is send it */
>>          if (!migration_bitmap_clear_dirty(rs, pss->block, pss->page)) {
>> @@ -1619,7 +1633,7 @@ uint64_t ram_bytes_total(void)
>>      uint64_t total = 0;
>>  
>>      rcu_read_lock();
>> -    RAMBLOCK_FOREACH(block) {
>> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>>          total += block->used_length;
>>      }
>>      rcu_read_unlock();
>> @@ -1674,7 +1688,7 @@ static void ram_save_cleanup(void *opaque)
>>       */
>>      memory_global_dirty_log_stop();
>>  
>> -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>>          g_free(block->bmap);
>>          block->bmap = NULL;
>>          g_free(block->unsentmap);
>> @@ -1737,7 +1751,7 @@ void ram_postcopy_migrated_memory_release(MigrationState *ms)
>>  {
>>      struct RAMBlock *block;
>>  
>> -    RAMBLOCK_FOREACH(block) {
>> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>>          unsigned long *bitmap = block->bmap;
>>          unsigned long range = block->used_length >> TARGET_PAGE_BITS;
>>          unsigned long run_start = find_next_zero_bit(bitmap, range, 0);
>> @@ -1815,7 +1829,7 @@ static int postcopy_each_ram_send_discard(MigrationState *ms)
>>      struct RAMBlock *block;
>>      int ret;
>>  
>> -    RAMBLOCK_FOREACH(block) {
>> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>>          PostcopyDiscardState *pds =
>>              postcopy_discard_send_init(ms, block->idstr);
>>  
>> @@ -2023,7 +2037,7 @@ int ram_postcopy_send_discard_bitmap(MigrationState *ms)
>>      rs->last_sent_block = NULL;
>>      rs->last_page = 0;
>>  
>> -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>>          unsigned long pages = block->used_length >> TARGET_PAGE_BITS;
>>          unsigned long *bitmap = block->bmap;
>>          unsigned long *unsentmap = block->unsentmap;
>> @@ -2182,7 +2196,7 @@ static void ram_list_init_bitmaps(void)
>>  
>>      /* Skip setting bitmap if there is no RAM */
>>      if (ram_bytes_total()) {
>> -        QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +        RAMBLOCK_FOREACH_MIGRATABLE(block) {
>>              pages = block->max_length >> TARGET_PAGE_BITS;
>>              block->bmap = bitmap_new(pages);
>>              bitmap_set(block->bmap, 0, pages);
>> @@ -2263,7 +2277,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>  
>>      qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>>  
>> -    RAMBLOCK_FOREACH(block) {
>> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>>          qemu_put_byte(f, strlen(block->idstr));
>>          qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
>>          qemu_put_be64(f, block->used_length);
>> @@ -2507,6 +2521,11 @@ static inline RAMBlock *ram_block_from_stream(QEMUFile *f, int flags)
>>          return NULL;
>>      }
>>  
>> +    if (!qemu_ram_is_migratable(block)) {
>> +        error_report("block %s should not be migrated !", id);
>> +        return NULL;
>> +    }
>> +
>>      return block;
>>  }
>>  
>> @@ -3011,7 +3030,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>                  length = qemu_get_be64(f);
>>  
>>                  block = qemu_ram_block_by_name(id);
>> -                if (block) {
>> +                if (block && !qemu_ram_is_migratable(block)) {
>> +                    error_report("block %s should not be migrated !", id);
>> +                    ret = -EINVAL;
>> +
> 
> (extra new line)

ah. yes indeed.

> I saw quite a few such checks in the patch that dump errors.
> Frankly speaking, I would still prefer assertions if we are sure
> these are programming errors, and I believe incorrect marking of
> RAMBlocks is one.  I think I understand the concern here: we are
> being careful in the migration code to avoid crashing the VM.  But at
> least here we are doing precopy loading, so there is no problem using
> an assertion since it won't crash the source (and after all we are
> returning -EINVAL, so I suppose the destination VM will quit anyway,
> much like a crash here).

It is true that QEMU would be receiving invalid state from the source
in this case, but asserting would stop migration on the target quite
brutally. It seems excessive for bad input.

Thanks,

C.

> 
> I don't have a strong opinion on this and the current patch actually
> looks good to me; I am only raising the question.
> 
> Thanks,
> 
>> +                } else if (block) {
>>                      if (length != block->used_length) {
>>                          Error *local_err = NULL;
>>  
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index e2be02afe42c..9ebfba738ea4 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -2501,11 +2501,13 @@ void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
>>  {
>>      qemu_ram_set_idstr(mr->ram_block,
>>                         memory_region_name(mr), dev);
>> +    qemu_ram_set_migratable(mr->ram_block);
>>  }
>>  
>>  void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
>>  {
>>      qemu_ram_unset_idstr(mr->ram_block);
>> +    qemu_ram_unset_migratable(mr->ram_block);
>>  }
>>  
>>  void vmstate_register_ram_global(MemoryRegion *mr)
>> -- 
>> 2.13.6
>>
>>
>
Dr. David Alan Gilbert May 11, 2018, 5:55 p.m. UTC | #5
* Cédric Le Goater (clg@kaod.org) wrote:
> On the POWER9 processor, the XIVE interrupt controller can control
> interrupt sources using MMIO to trigger events, to EOI or to turn off
> the sources. Priority management and interrupt acknowledgment is also
> controlled by MMIO in the presenter sub-engine.
> 
> These MMIO regions are exposed to guests in QEMU with a set of 'ram
> device' memory mappings, similarly to VFIO, and the VMAs are populated
> dynamically with the appropriate pages using a fault handler.
> 
> But, these regions are an issue for migration. We need to discard the
> associated RAMBlocks from the RAM state on the source VM and let the
> destination VM rebuild the memory mappings on the new host in the
> post_load() operation just before resuming the system.
> 
> To achieve this goal, the following introduces a new RAMBlock flag
> RAM_MIGRATABLE which is updated in the vmstate_register_ram() and
> vmstate_unregister_ram() routines. This flag is then used by the
> migration to identify RAMBlocks to discard on the source. Some checks
> are also performed on the destination to make sure nothing invalid was
> sent.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

I was about to add Reviewed-by when I noticed that we use
qemu_ram_foreach_block in a number of places, once in migration/ram.c
and a bunch in migration/postcopy-ram.c.

Sorry, I guess we need a version of that as well (or a flag).

Dave
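
A rough sketch of what such a migratable-aware variant of
qemu_ram_foreach_block() could look like in exec.c follows. It is a
sketch only: it assumes the existing RAMBlockIterFunc callback style,
it is not part of the posted patch, and the eventual follow-up may
well differ.

/* Sketch only -- iterates migratable RAMBlocks and skips the rest,
 * mirroring qemu_ram_foreach_block(). */
int qemu_ram_foreach_migratable_block(RAMBlockIterFunc func, void *opaque)
{
    RAMBlock *block;
    int ret = 0;

    rcu_read_lock();
    RAMBLOCK_FOREACH(block) {
        if (!qemu_ram_is_migratable(block)) {
            continue;   /* skip 'ram device' and other unregistered blocks */
        }
        ret = func(block->idstr, block->host, block->offset,
                   block->used_length, opaque);
        if (ret) {
            break;
        }
    }
    rcu_read_unlock();
    return ret;
}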


> ---
> 
>  Changes since v2:
> 
>  - added an error_report() in ram_save_host_page() 
>  - un/set the RAMBlock RAM_MIGRATABLE directly under vmstate_un/register_ram()
>    with some new flag helpers 
>  
>  exec.c                    | 18 ++++++++++++++++++
>  include/exec/cpu-common.h |  3 +++
>  migration/ram.c           | 43 +++++++++++++++++++++++++++++++++----------
>  migration/savevm.c        |  2 ++
>  4 files changed, 56 insertions(+), 10 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index c7fcefa851b2..079c5c8bab7b 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -104,6 +104,9 @@ static MemoryRegion io_mem_unassigned;
>   * (Set during postcopy)
>   */
>  #define RAM_UF_ZEROPAGE (1 << 3)
> +
> +/* RAM can be migrated */
> +#define RAM_MIGRATABLE (1 << 4)
>  #endif
>  
>  #ifdef TARGET_PAGE_BITS_VARY
> @@ -1797,6 +1800,21 @@ void qemu_ram_set_uf_zeroable(RAMBlock *rb)
>      rb->flags |= RAM_UF_ZEROPAGE;
>  }
>  
> +bool qemu_ram_is_migratable(RAMBlock *rb)
> +{
> +    return rb->flags & RAM_MIGRATABLE;
> +}
> +
> +void qemu_ram_set_migratable(RAMBlock *rb)
> +{
> +    rb->flags |= RAM_MIGRATABLE;
> +}
> +
> +void qemu_ram_unset_migratable(RAMBlock *rb)
> +{
> +    rb->flags &= ~RAM_MIGRATABLE;
> +}
> +
>  /* Called with iothread lock held.  */
>  void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState *dev)
>  {
> diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
> index 24d335f95d45..488288fce959 100644
> --- a/include/exec/cpu-common.h
> +++ b/include/exec/cpu-common.h
> @@ -75,6 +75,9 @@ const char *qemu_ram_get_idstr(RAMBlock *rb);
>  bool qemu_ram_is_shared(RAMBlock *rb);
>  bool qemu_ram_is_uf_zeroable(RAMBlock *rb);
>  void qemu_ram_set_uf_zeroable(RAMBlock *rb);
> +bool qemu_ram_is_migratable(RAMBlock *rb);
> +void qemu_ram_set_migratable(RAMBlock *rb);
> +void qemu_ram_unset_migratable(RAMBlock *rb);
>  
>  size_t qemu_ram_pagesize(RAMBlock *block);
>  size_t qemu_ram_pagesize_largest(void);
> diff --git a/migration/ram.c b/migration/ram.c
> index 912810c18e0f..dfdec78ecb03 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -187,6 +187,11 @@ void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr,
>                        nr);
>  }
>  
> +/* Should be holding either ram_list.mutex, or the RCU lock. */
> +#define RAMBLOCK_FOREACH_MIGRATABLE(block)             \
> +    RAMBLOCK_FOREACH(block)                            \
> +        if (!qemu_ram_is_migratable(block)) {} else
> +
>  /*
>   * An outstanding page request, on the source, having been received
>   * and queued
> @@ -813,6 +818,10 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
>      unsigned long *bitmap = rb->bmap;
>      unsigned long next;
>  
> +    if (!qemu_ram_is_migratable(rb)) {
> +        return size;
> +    }
> +
>      if (rs->ram_bulk_stage && start > 0) {
>          next = start + 1;
>      } else {
> @@ -858,7 +867,7 @@ uint64_t ram_pagesize_summary(void)
>      RAMBlock *block;
>      uint64_t summary = 0;
>  
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          summary |= block->page_size;
>      }
>  
> @@ -882,7 +891,7 @@ static void migration_bitmap_sync(RAMState *rs)
>  
>      qemu_mutex_lock(&rs->bitmap_mutex);
>      rcu_read_lock();
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          migration_bitmap_sync_range(rs, block, 0, block->used_length);
>      }
>      rcu_read_unlock();
> @@ -1521,6 +1530,11 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
>      size_t pagesize_bits =
>          qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
>  
> +    if (!qemu_ram_is_migratable(pss->block)) {
> +        error_report("block %s should not be migrated !", pss->block->idstr);
> +        return 0;
> +    }
> +
>      do {
>          /* Check the pages is dirty and if it is send it */
>          if (!migration_bitmap_clear_dirty(rs, pss->block, pss->page)) {
> @@ -1619,7 +1633,7 @@ uint64_t ram_bytes_total(void)
>      uint64_t total = 0;
>  
>      rcu_read_lock();
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          total += block->used_length;
>      }
>      rcu_read_unlock();
> @@ -1674,7 +1688,7 @@ static void ram_save_cleanup(void *opaque)
>       */
>      memory_global_dirty_log_stop();
>  
> -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          g_free(block->bmap);
>          block->bmap = NULL;
>          g_free(block->unsentmap);
> @@ -1737,7 +1751,7 @@ void ram_postcopy_migrated_memory_release(MigrationState *ms)
>  {
>      struct RAMBlock *block;
>  
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          unsigned long *bitmap = block->bmap;
>          unsigned long range = block->used_length >> TARGET_PAGE_BITS;
>          unsigned long run_start = find_next_zero_bit(bitmap, range, 0);
> @@ -1815,7 +1829,7 @@ static int postcopy_each_ram_send_discard(MigrationState *ms)
>      struct RAMBlock *block;
>      int ret;
>  
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          PostcopyDiscardState *pds =
>              postcopy_discard_send_init(ms, block->idstr);
>  
> @@ -2023,7 +2037,7 @@ int ram_postcopy_send_discard_bitmap(MigrationState *ms)
>      rs->last_sent_block = NULL;
>      rs->last_page = 0;
>  
> -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          unsigned long pages = block->used_length >> TARGET_PAGE_BITS;
>          unsigned long *bitmap = block->bmap;
>          unsigned long *unsentmap = block->unsentmap;
> @@ -2182,7 +2196,7 @@ static void ram_list_init_bitmaps(void)
>  
>      /* Skip setting bitmap if there is no RAM */
>      if (ram_bytes_total()) {
> -        QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        RAMBLOCK_FOREACH_MIGRATABLE(block) {
>              pages = block->max_length >> TARGET_PAGE_BITS;
>              block->bmap = bitmap_new(pages);
>              bitmap_set(block->bmap, 0, pages);
> @@ -2263,7 +2277,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>  
>      qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>  
> -    RAMBLOCK_FOREACH(block) {
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
>          qemu_put_byte(f, strlen(block->idstr));
>          qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
>          qemu_put_be64(f, block->used_length);
> @@ -2507,6 +2521,11 @@ static inline RAMBlock *ram_block_from_stream(QEMUFile *f, int flags)
>          return NULL;
>      }
>  
> +    if (!qemu_ram_is_migratable(block)) {
> +        error_report("block %s should not be migrated !", id);
> +        return NULL;
> +    }
> +
>      return block;
>  }
>  
> @@ -3011,7 +3030,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>                  length = qemu_get_be64(f);
>  
>                  block = qemu_ram_block_by_name(id);
> -                if (block) {
> +                if (block && !qemu_ram_is_migratable(block)) {
> +                    error_report("block %s should not be migrated !", id);
> +                    ret = -EINVAL;
> +
> +                } else if (block) {
>                      if (length != block->used_length) {
>                          Error *local_err = NULL;
>  
> diff --git a/migration/savevm.c b/migration/savevm.c
> index e2be02afe42c..9ebfba738ea4 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2501,11 +2501,13 @@ void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
>  {
>      qemu_ram_set_idstr(mr->ram_block,
>                         memory_region_name(mr), dev);
> +    qemu_ram_set_migratable(mr->ram_block);
>  }
>  
>  void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
>  {
>      qemu_ram_unset_idstr(mr->ram_block);
> +    qemu_ram_unset_migratable(mr->ram_block);
>  }
>  
>  void vmstate_register_ram_global(MemoryRegion *mr)
> -- 
> 2.13.6
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Patch

diff --git a/exec.c b/exec.c
index c7fcefa851b2..079c5c8bab7b 100644
--- a/exec.c
+++ b/exec.c
@@ -104,6 +104,9 @@  static MemoryRegion io_mem_unassigned;
  * (Set during postcopy)
  */
 #define RAM_UF_ZEROPAGE (1 << 3)
+
+/* RAM can be migrated */
+#define RAM_MIGRATABLE (1 << 4)
 #endif
 
 #ifdef TARGET_PAGE_BITS_VARY
@@ -1797,6 +1800,21 @@  void qemu_ram_set_uf_zeroable(RAMBlock *rb)
     rb->flags |= RAM_UF_ZEROPAGE;
 }
 
+bool qemu_ram_is_migratable(RAMBlock *rb)
+{
+    return rb->flags & RAM_MIGRATABLE;
+}
+
+void qemu_ram_set_migratable(RAMBlock *rb)
+{
+    rb->flags |= RAM_MIGRATABLE;
+}
+
+void qemu_ram_unset_migratable(RAMBlock *rb)
+{
+    rb->flags &= ~RAM_MIGRATABLE;
+}
+
 /* Called with iothread lock held.  */
 void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState *dev)
 {
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 24d335f95d45..488288fce959 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -75,6 +75,9 @@  const char *qemu_ram_get_idstr(RAMBlock *rb);
 bool qemu_ram_is_shared(RAMBlock *rb);
 bool qemu_ram_is_uf_zeroable(RAMBlock *rb);
 void qemu_ram_set_uf_zeroable(RAMBlock *rb);
+bool qemu_ram_is_migratable(RAMBlock *rb);
+void qemu_ram_set_migratable(RAMBlock *rb);
+void qemu_ram_unset_migratable(RAMBlock *rb);
 
 size_t qemu_ram_pagesize(RAMBlock *block);
 size_t qemu_ram_pagesize_largest(void);
diff --git a/migration/ram.c b/migration/ram.c
index 912810c18e0f..dfdec78ecb03 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -187,6 +187,11 @@  void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr,
                       nr);
 }
 
+/* Should be holding either ram_list.mutex, or the RCU lock. */
+#define RAMBLOCK_FOREACH_MIGRATABLE(block)             \
+    RAMBLOCK_FOREACH(block)                            \
+        if (!qemu_ram_is_migratable(block)) {} else
+
 /*
  * An outstanding page request, on the source, having been received
  * and queued
@@ -813,6 +818,10 @@  unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
     unsigned long *bitmap = rb->bmap;
     unsigned long next;
 
+    if (!qemu_ram_is_migratable(rb)) {
+        return size;
+    }
+
     if (rs->ram_bulk_stage && start > 0) {
         next = start + 1;
     } else {
@@ -858,7 +867,7 @@  uint64_t ram_pagesize_summary(void)
     RAMBlock *block;
     uint64_t summary = 0;
 
-    RAMBLOCK_FOREACH(block) {
+    RAMBLOCK_FOREACH_MIGRATABLE(block) {
         summary |= block->page_size;
     }
 
@@ -882,7 +891,7 @@  static void migration_bitmap_sync(RAMState *rs)
 
     qemu_mutex_lock(&rs->bitmap_mutex);
     rcu_read_lock();
-    RAMBLOCK_FOREACH(block) {
+    RAMBLOCK_FOREACH_MIGRATABLE(block) {
         migration_bitmap_sync_range(rs, block, 0, block->used_length);
     }
     rcu_read_unlock();
@@ -1521,6 +1530,11 @@  static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
     size_t pagesize_bits =
         qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
 
+    if (!qemu_ram_is_migratable(pss->block)) {
+        error_report("block %s should not be migrated !", pss->block->idstr);
+        return 0;
+    }
+
     do {
         /* Check the pages is dirty and if it is send it */
         if (!migration_bitmap_clear_dirty(rs, pss->block, pss->page)) {
@@ -1619,7 +1633,7 @@  uint64_t ram_bytes_total(void)
     uint64_t total = 0;
 
     rcu_read_lock();
-    RAMBLOCK_FOREACH(block) {
+    RAMBLOCK_FOREACH_MIGRATABLE(block) {
         total += block->used_length;
     }
     rcu_read_unlock();
@@ -1674,7 +1688,7 @@  static void ram_save_cleanup(void *opaque)
      */
     memory_global_dirty_log_stop();
 
-    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+    RAMBLOCK_FOREACH_MIGRATABLE(block) {
         g_free(block->bmap);
         block->bmap = NULL;
         g_free(block->unsentmap);
@@ -1737,7 +1751,7 @@  void ram_postcopy_migrated_memory_release(MigrationState *ms)
 {
     struct RAMBlock *block;
 
-    RAMBLOCK_FOREACH(block) {
+    RAMBLOCK_FOREACH_MIGRATABLE(block) {
         unsigned long *bitmap = block->bmap;
         unsigned long range = block->used_length >> TARGET_PAGE_BITS;
         unsigned long run_start = find_next_zero_bit(bitmap, range, 0);
@@ -1815,7 +1829,7 @@  static int postcopy_each_ram_send_discard(MigrationState *ms)
     struct RAMBlock *block;
     int ret;
 
-    RAMBLOCK_FOREACH(block) {
+    RAMBLOCK_FOREACH_MIGRATABLE(block) {
         PostcopyDiscardState *pds =
             postcopy_discard_send_init(ms, block->idstr);
 
@@ -2023,7 +2037,7 @@  int ram_postcopy_send_discard_bitmap(MigrationState *ms)
     rs->last_sent_block = NULL;
     rs->last_page = 0;
 
-    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+    RAMBLOCK_FOREACH_MIGRATABLE(block) {
         unsigned long pages = block->used_length >> TARGET_PAGE_BITS;
         unsigned long *bitmap = block->bmap;
         unsigned long *unsentmap = block->unsentmap;
@@ -2182,7 +2196,7 @@  static void ram_list_init_bitmaps(void)
 
     /* Skip setting bitmap if there is no RAM */
     if (ram_bytes_total()) {
-        QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        RAMBLOCK_FOREACH_MIGRATABLE(block) {
             pages = block->max_length >> TARGET_PAGE_BITS;
             block->bmap = bitmap_new(pages);
             bitmap_set(block->bmap, 0, pages);
@@ -2263,7 +2277,7 @@  static int ram_save_setup(QEMUFile *f, void *opaque)
 
     qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
 
-    RAMBLOCK_FOREACH(block) {
+    RAMBLOCK_FOREACH_MIGRATABLE(block) {
         qemu_put_byte(f, strlen(block->idstr));
         qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
         qemu_put_be64(f, block->used_length);
@@ -2507,6 +2521,11 @@  static inline RAMBlock *ram_block_from_stream(QEMUFile *f, int flags)
         return NULL;
     }
 
+    if (!qemu_ram_is_migratable(block)) {
+        error_report("block %s should not be migrated !", id);
+        return NULL;
+    }
+
     return block;
 }
 
@@ -3011,7 +3030,11 @@  static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 length = qemu_get_be64(f);
 
                 block = qemu_ram_block_by_name(id);
-                if (block) {
+                if (block && !qemu_ram_is_migratable(block)) {
+                    error_report("block %s should not be migrated !", id);
+                    ret = -EINVAL;
+
+                } else if (block) {
                     if (length != block->used_length) {
                         Error *local_err = NULL;
 
diff --git a/migration/savevm.c b/migration/savevm.c
index e2be02afe42c..9ebfba738ea4 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2501,11 +2501,13 @@  void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
 {
     qemu_ram_set_idstr(mr->ram_block,
                        memory_region_name(mr), dev);
+    qemu_ram_set_migratable(mr->ram_block);
 }
 
 void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
 {
     qemu_ram_unset_idstr(mr->ram_block);
+    qemu_ram_unset_migratable(mr->ram_block);
 }
 
 void vmstate_register_ram_global(MemoryRegion *mr)