
[2/2] migration: Calculate ram size once

Message ID: 20220510231708.7197-3-quintela@redhat.com
State: New
Series: migration: Store ram size value

Commit Message

Juan Quintela May 10, 2022, 11:17 p.m. UTC
We are recalculating the RAM size continuously, even though we know it
doesn't change during migration.  Create a field in RAMState to track it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/ram.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

David Hildenbrand May 11, 2022, 9:22 a.m. UTC | #1
On 11.05.22 01:17, Juan Quintela wrote:
> We are recalculating the RAM size continuously, even though we know it
> doesn't change during migration.  Create a field in RAMState to track it.

We do have resizable RAM, which triggers ram_mig_ram_block_resized()
whenever block->used_length changes.

ram_mig_ram_block_resized() aborts migration when we detect a resize on
the source. I assume that is what you care about here, right?
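
For reference, the source-side handling is roughly this (a simplified
sketch of ram_mig_ram_block_resized() in migration/ram.c; the real
function also handles postcopy states and ignored blocks):

static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
                                      size_t old_size, size_t new_size)
{
    ram_addr_t offset;
    RAMBlock *rb = qemu_ram_block_from_host(host, false, &offset);
    Error *err = NULL;

    if (!migration_is_idle()) {
        /*
         * Precopy cannot cope with used_length changing after the block
         * sizes have been sent in the migration stream, so abort with a
         * proper reason.
         */
        error_setg(&err, "RAM block '%s' resized during precopy.",
                   rb->idstr);
        migration_cancel(err);
        error_free(err);
    }
}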
Juan Quintela May 16, 2022, 10:04 a.m. UTC | #2
David Hildenbrand <david@redhat.com> wrote:
> On 11.05.22 01:17, Juan Quintela wrote:
>> We are recalculating the RAM size continuously, even though we know it
>> doesn't change during migration.  Create a field in RAMState to track it.

Hi

> We do have resizable RAM, which triggers ram_mig_ram_block_resized() on
> resizes when changing block->used_length.

Yeap.

The problem is this bit of the patch:

@@ -2259,7 +2261,7 @@ static int ram_find_and_save_block(RAMState *rs)
     bool again, found;
 
     /* No dirty page as there is zero RAM */
-    if (!ram_bytes_total()) {
+    if (!rs->ram_bytes_total) {
         return pages;
     }
 
On a 1TB guest, we moved from 75.8 seconds to 70.0 seconds total time
just with this change.  This is an idle guest full of zero pages.  If
the guest is busier, the effect is less noticeable because we spend more
time sending pages.
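
For context, ram_bytes_total() walks every migratable RAMBlock under the
RCU read lock on each call, roughly like this (a simplified sketch of
the helper in migration/ram.c; the exact macros and the variant that
counts ignored blocks are omitted):

uint64_t ram_bytes_total(void)
{
    RAMBlock *block;
    uint64_t total = 0;

    RCU_READ_LOCK_GUARD();
    /* Sum the used length of every block that takes part in migration */
    RAMBLOCK_FOREACH_NOT_IGNORED(block) {
        total += block->used_length;
    }
    return total;
}

Doing that once per ram_find_and_save_block() call is pure overhead:
the sum only changes when a block is resized, and a resize cancels
migration anyway.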

This effect is even more noticeable if you put this series on top of the
zero page detection series that I sent before.  With that, we end up
with ram_bytes_total_common() as the second biggest CPU consumer.

As you can see from the perf profile that I sent, with upstream we get:


  10.42%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   3.71%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common

(I have deleted the functions that are not related to this code path.
ram_find_and_save_block() is the function that calls
ram_bytes_total_common().)
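
(A profile like this can be gathered with something along the lines of
"perf record -g -p $(pidof qemu-system-x86_64) -- sleep 30" followed by
"perf report"; exact options may vary.)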

After the patch, we move to:

  11.32%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0

And ram_bytes_total_common() doesn't even show up anymore.  Notice that
my series only removes one call to it.  The CPU utilization here is
higher for ram_find_and_save_block() because:

- Measuring things is difficult (TM)
- With this patch, migration goes faster, so we call
  ram_find_and_save_block() more times in the same period of time.

> ram_mig_ram_block_resized() aborts migration when we detect a resize on
> the source. I assume that is what you care about here, right?

Ouch.  I hadn't noticed that we could change the RAM size during
migration.  Anyway, as you said, if the RAM size changes, we cancel
migration, so we can still cache the size value.  Actually, I have just
put it in ram_state_init(), because that is where we set up the bitmaps,
so if we change the bitmap, we also change the cached ram_size value.

Thanks, Juan.

Patch

diff --git a/migration/ram.c b/migration/ram.c
index b3fa3d5d8f..5d415834e5 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -301,6 +301,8 @@ struct RAMState {
     QEMUFile *f;
     /* UFFD file descriptor, used in 'write-tracking' migration */
     int uffdio_fd;
+    /* total ram size in bytes */
+    uint64_t ram_bytes_total;
     /* Last block that we have visited searching for dirty pages */
     RAMBlock *last_seen_block;
     /* Last block from where we have sent data */
@@ -2259,7 +2261,7 @@ static int ram_find_and_save_block(RAMState *rs)
     bool again, found;
 
     /* No dirty page as there is zero RAM */
-    if (!ram_bytes_total()) {
+    if (!rs->ram_bytes_total) {
         return pages;
     }
 
@@ -2707,13 +2709,14 @@ static int ram_state_init(RAMState **rsp)
     qemu_mutex_init(&(*rsp)->bitmap_mutex);
     qemu_mutex_init(&(*rsp)->src_page_req_mutex);
     QSIMPLEQ_INIT(&(*rsp)->src_page_requests);
+    (*rsp)->ram_bytes_total = ram_bytes_total();
 
     /*
      * Count the total number of pages used by ram blocks not including any
      * gaps due to alignment or unplugs.
      * This must match with the initial values of dirty bitmap.
      */
-    (*rsp)->migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS;
+    (*rsp)->migration_dirty_pages = (*rsp)->ram_bytes_total >> TARGET_PAGE_BITS;
     ram_state_reset(*rsp);
 
     return 0;