Message ID | 20220510224220.5912-3-quintela@redhat.com
---|---
State | New
Series | Migration: Transmit and detect zero pages in the multifd threads
* Juan Quintela (quintela@redhat.com) wrote:
> We were calling qemu_target_page_size() left and right.
>
> Signed-off-by: Juan Quintela <quintela@redhat.com>

(Copying in Peter Maydell)

Your problem here is that most of these files are target independent, so
you end up calling the qemu_target_page_size functions, which I guess
you're seeing pop up in some perf trace? I mean, they're trivial
functions, but I guess you do get the function call.

I wonder about the following patch instead (note I've removed the const
on the structure here); I wonder how this does performance-wise for
everyone:


From abc7da46736b18b6138868ccc0b11901169e1dfd Mon Sep 17 00:00:00 2001
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Date: Mon, 16 May 2022 19:54:31 +0100
Subject: [PATCH] target-page: Maintain target_page variable even for
 non-variable
Content-type: text/plain

On architectures that define TARGET_PAGE_BITS_VARY, the 'target_page'
structure gets filled in at run time with the number of bits, and the
TARGET_PAGE_BITS and TARGET_PAGE macros use that rather than being
constant.

On non-variable page size systems target_page is not filled in, and we
rely on TARGET_PAGE_SIZE being defined at compile time.

The problem is that source files that are target independent end up
calling qemu_target_page_size to read the size, and that function call
is annoying.

Improve this by always filling in 'target_page', even for non-variable
size CPUs, and inlining the functions that previously returned the
macro values (which may have been constant) to return the values read
from target_page.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/exec/cpu-all.h     |  4 ++--
 include/exec/page-vary.h   |  2 ++
 include/exec/target_page.h | 11 +++++++++--
 page-vary.c                |  2 --
 softmmu/physmem.c          | 10 ----------
 5 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 5d5290deb5..6a498fa033 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -214,9 +214,9 @@ static inline void stl_phys_notdirty(AddressSpace *as, hwaddr addr, uint32_t val
 
 /* page related stuff */
 
+#include "exec/page-vary.h"
+
 #ifdef TARGET_PAGE_BITS_VARY
-# include "exec/page-vary.h"
-extern const TargetPageBits target_page;
 #ifdef CONFIG_DEBUG_TCG
 #define TARGET_PAGE_BITS   ({ assert(target_page.decided); target_page.bits; })
 #define TARGET_PAGE_MASK   ({ assert(target_page.decided); \
diff --git a/include/exec/page-vary.h b/include/exec/page-vary.h
index ebbe9b169b..31cb9dd9dd 100644
--- a/include/exec/page-vary.h
+++ b/include/exec/page-vary.h
@@ -26,6 +26,8 @@ typedef struct {
     uint64_t mask;
 } TargetPageBits;
 
+extern TargetPageBits target_page;
+
 #ifdef IN_PAGE_VARY
 extern bool set_preferred_target_page_bits_common(int bits);
 extern void finalize_target_page_bits_common(int min);
diff --git a/include/exec/target_page.h b/include/exec/target_page.h
index 96726c36a4..e718b145b3 100644
--- a/include/exec/target_page.h
+++ b/include/exec/target_page.h
@@ -13,9 +13,16 @@
 #ifndef EXEC_TARGET_PAGE_H
 #define EXEC_TARGET_PAGE_H
 
+#include "exec/page-vary.h"
+
+inline int qemu_target_page_bits(void) {
+    return target_page.bits;
+}
+
+inline size_t qemu_target_page_size(void) {
+    return 1 << target_page.bits;
+}
 
-size_t qemu_target_page_size(void);
-int qemu_target_page_bits(void);
 int qemu_target_page_bits_min(void);
 
 #endif
diff --git a/page-vary.c b/page-vary.c
index 343b4adb95..3f81144cda 100644
--- a/page-vary.c
+++ b/page-vary.c
@@ -35,7 +35,5 @@ bool set_preferred_target_page_bits(int bits)
 
 void finalize_target_page_bits(void)
 {
-#ifdef TARGET_PAGE_BITS_VARY
     finalize_target_page_bits_common(TARGET_PAGE_BITS_MIN);
-#endif
 }
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 657841eed0..2117476081 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -3515,16 +3515,6 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr addr,
  * Allows code that needs to deal with migration bitmaps etc to still be built
  * target independent.
  */
-size_t qemu_target_page_size(void)
-{
-    return TARGET_PAGE_SIZE;
-}
-
-int qemu_target_page_bits(void)
-{
-    return TARGET_PAGE_BITS;
-}
-
 int qemu_target_page_bits_min(void)
 {
     return TARGET_PAGE_BITS_MIN;
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> We were calling qemu_target_page_size() left and right.
>>
>> Signed-off-by: Juan Quintela <quintela@redhat.com>

[Adding Richard]

> (Copying in Peter Maydell)
> Your problem here is that most of these files are target independent,
> so you end up calling the qemu_target_page_size functions, which I guess
> you're seeing pop up in some perf trace?
> I mean, they're trivial functions, but I guess you do get the function
> call.

Hi

There are several problems here:

- Richard complained in previous reviews that we were calling
  qemu_target_page_size() inside loops, or more than once per function
  (he was right).
- The qemu_target_page_size() name is so long that it basically meant I
  had to split the line at each appearance.
- All migration code assumes that the value is constant for the current
  migration, even though in principle it can change.

So I decided to cache the value in the structure and call it a day.
The same goes for the other page_count field.

I have never seen that function in a performance profile, so this is
just a taste/aesthetic issue. I think your patch is still good, but it
doesn't cover any of the issues I just listed.

Thanks, Juan.

> I wonder about the following patch instead
> (note I've removed the const on the structure here); I wonder how this
> does performance-wise for everyone:
>
>
> From abc7da46736b18b6138868ccc0b11901169e1dfd Mon Sep 17 00:00:00 2001
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Date: Mon, 16 May 2022 19:54:31 +0100
> Subject: [PATCH] target-page: Maintain target_page variable even for
>  non-variable
> Content-type: text/plain
>
> On architectures that define TARGET_PAGE_BITS_VARY, the 'target_page'
> structure gets filled in at run time with the number of bits, and the
> TARGET_PAGE_BITS and TARGET_PAGE macros use that rather than being
> constant.
>
> On non-variable page size systems target_page is not filled in, and we
> rely on TARGET_PAGE_SIZE being defined at compile time.
>
> The problem is that source files that are target independent end up
> calling qemu_target_page_size to read the size, and that function call
> is annoying.
>
> Improve this by always filling in 'target_page', even for non-variable
> size CPUs, and inlining the functions that previously returned the
> macro values (which may have been constant) to return the values read
> from target_page.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
diff --git a/migration/multifd.h b/migration/multifd.h
index f1f88c6737..4de80d9e53 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -80,6 +80,8 @@ typedef struct {
     bool registered_yank;
     /* packet allocated len */
     uint32_t packet_len;
+    /* guest page size */
+    uint32_t page_size;
 
     /* sem where to wait for more work */
     QemuSemaphore sem;
@@ -141,6 +143,8 @@ typedef struct {
     QIOChannel *c;
     /* packet allocated len */
     uint32_t packet_len;
+    /* guest page size */
+    uint32_t page_size;
 
     /* syncs main thread and channels */
     QemuSemaphore sem_sync;
diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
index 3a7ae44485..28349ff2e0 100644
--- a/migration/multifd-zlib.c
+++ b/migration/multifd-zlib.c
@@ -100,7 +100,6 @@ static void zlib_send_cleanup(MultiFDSendParams *p, Error **errp)
 static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
 {
     struct zlib_data *z = p->data;
-    size_t page_size = qemu_target_page_size();
     z_stream *zs = &z->zs;
     uint32_t out_size = 0;
     int ret;
@@ -114,7 +113,7 @@ static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
             flush = Z_SYNC_FLUSH;
         }
 
-        zs->avail_in = page_size;
+        zs->avail_in = p->page_size;
         zs->next_in = p->pages->block->host + p->normal[i];
 
         zs->avail_out = available;
@@ -220,12 +219,11 @@ static void zlib_recv_cleanup(MultiFDRecvParams *p)
 static int zlib_recv_pages(MultiFDRecvParams *p, Error **errp)
 {
     struct zlib_data *z = p->data;
-    size_t page_size = qemu_target_page_size();
     z_stream *zs = &z->zs;
     uint32_t in_size = p->next_packet_size;
     /* we measure the change of total_out */
     uint32_t out_size = zs->total_out;
-    uint32_t expected_size = p->normal_num * page_size;
+    uint32_t expected_size = p->normal_num * p->page_size;
     uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
     int ret;
     int i;
@@ -252,7 +250,7 @@ static int zlib_recv_pages(MultiFDRecvParams *p, Error **errp)
             flush = Z_SYNC_FLUSH;
         }
 
-        zs->avail_out = page_size;
+        zs->avail_out = p->page_size;
         zs->next_out = p->host + p->normal[i];
 
         /*
@@ -266,8 +264,8 @@ static int zlib_recv_pages(MultiFDRecvParams *p, Error **errp)
         do {
             ret = inflate(zs, flush);
         } while (ret == Z_OK && zs->avail_in
-                 && (zs->total_out - start) < page_size);
-        if (ret == Z_OK && (zs->total_out - start) < page_size) {
+                 && (zs->total_out - start) < p->page_size);
+        if (ret == Z_OK && (zs->total_out - start) < p->page_size) {
             error_setg(errp, "multifd %u: inflate generated too few output",
                        p->id);
             return -1;
diff --git a/migration/multifd-zstd.c b/migration/multifd-zstd.c
index d788d309f2..f4a8e1ed1f 100644
--- a/migration/multifd-zstd.c
+++ b/migration/multifd-zstd.c
@@ -113,7 +113,6 @@ static void zstd_send_cleanup(MultiFDSendParams *p, Error **errp)
 static int zstd_send_prepare(MultiFDSendParams *p, Error **errp)
 {
     struct zstd_data *z = p->data;
-    size_t page_size = qemu_target_page_size();
     int ret;
     uint32_t i;
@@ -128,7 +127,7 @@ static int zstd_send_prepare(MultiFDSendParams *p, Error **errp)
             flush = ZSTD_e_flush;
         }
         z->in.src = p->pages->block->host + p->normal[i];
-        z->in.size = page_size;
+        z->in.size = p->page_size;
         z->in.pos = 0;
 
         /*
@@ -241,8 +240,7 @@ static int zstd_recv_pages(MultiFDRecvParams *p, Error **errp)
 {
     uint32_t in_size = p->next_packet_size;
     uint32_t out_size = 0;
-    size_t page_size = qemu_target_page_size();
-    uint32_t expected_size = p->normal_num * page_size;
+    uint32_t expected_size = p->normal_num * p->page_size;
     uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
     struct zstd_data *z = p->data;
     int ret;
@@ -265,7 +263,7 @@ static int zstd_recv_pages(MultiFDRecvParams *p, Error **errp)
     for (i = 0; i < p->normal_num; i++) {
         z->out.dst = p->host + p->normal[i];
-        z->out.size = page_size;
+        z->out.size = p->page_size;
         z->out.pos = 0;
 
         /*
@@ -279,8 +277,8 @@ static int zstd_recv_pages(MultiFDRecvParams *p, Error **errp)
         do {
             ret = ZSTD_decompressStream(z->zds, &z->out, &z->in);
         } while (ret > 0 && (z->in.size - z->in.pos > 0)
-                 && (z->out.pos < page_size));
-        if (ret > 0 && (z->out.pos < page_size)) {
+                 && (z->out.pos < p->page_size));
+        if (ret > 0 && (z->out.pos < p->page_size)) {
             error_setg(errp, "multifd %u: decompressStream buffer too small",
                        p->id);
             return -1;
diff --git a/migration/multifd.c b/migration/multifd.c
index 9ea4f581e2..f15fed5f1f 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -87,15 +87,14 @@ static void nocomp_send_cleanup(MultiFDSendParams *p, Error **errp)
 static int nocomp_send_prepare(MultiFDSendParams *p, Error **errp)
 {
     MultiFDPages_t *pages = p->pages;
-    size_t page_size = qemu_target_page_size();
 
     for (int i = 0; i < p->normal_num; i++) {
         p->iov[p->iovs_num].iov_base = pages->block->host + p->normal[i];
-        p->iov[p->iovs_num].iov_len = page_size;
+        p->iov[p->iovs_num].iov_len = p->page_size;
         p->iovs_num++;
     }
 
-    p->next_packet_size = p->normal_num * page_size;
+    p->next_packet_size = p->normal_num * p->page_size;
     p->flags |= MULTIFD_FLAG_NOCOMP;
     return 0;
 }
@@ -139,7 +138,6 @@ static void nocomp_recv_cleanup(MultiFDRecvParams *p)
 static int nocomp_recv_pages(MultiFDRecvParams *p, Error **errp)
 {
     uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
-    size_t page_size = qemu_target_page_size();
 
     if (flags != MULTIFD_FLAG_NOCOMP) {
         error_setg(errp, "multifd %u: flags received %x flags expected %x",
@@ -148,7 +146,7 @@ static int nocomp_recv_pages(MultiFDRecvParams *p, Error **errp)
     }
     for (int i = 0; i < p->normal_num; i++) {
         p->iov[i].iov_base = p->host + p->normal[i];
-        p->iov[i].iov_len = page_size;
+        p->iov[i].iov_len = p->page_size;
     }
     return qio_channel_readv_all(p->c, p->iov, p->normal_num, errp);
 }
@@ -281,8 +279,7 @@ static void multifd_send_fill_packet(MultiFDSendParams *p)
 static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
 {
     MultiFDPacket_t *packet = p->packet;
-    size_t page_size = qemu_target_page_size();
-    uint32_t page_count = MULTIFD_PACKET_SIZE / page_size;
+    uint32_t page_count = MULTIFD_PACKET_SIZE / p->page_size;
     RAMBlock *block;
     int i;
@@ -344,7 +341,7 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
     for (i = 0; i < p->normal_num; i++) {
         uint64_t offset = be64_to_cpu(packet->offset[i]);
 
-        if (offset > (block->used_length - page_size)) {
+        if (offset > (block->used_length - p->page_size)) {
             error_setg(errp, "multifd: offset too long %" PRIu64
                        " (max " RAM_ADDR_FMT ")",
                        offset, block->used_length);
@@ -433,8 +430,7 @@ static int multifd_send_pages(QEMUFile *f)
     p->packet_num = multifd_send_state->packet_num++;
     multifd_send_state->pages = p->pages;
     p->pages = pages;
-    transferred = ((uint64_t) pages->num) * qemu_target_page_size()
-                + p->packet_len;
+    transferred = ((uint64_t) pages->num) * p->page_size + p->packet_len;
     qemu_file_update_transfer(f, transferred);
     ram_counters.multifd_bytes += transferred;
     ram_counters.transferred += transferred;
@@ -898,6 +894,7 @@ int multifd_save_setup(Error **errp)
         /* We need one extra place for the packet header */
         p->iov = g_new0(struct iovec, page_count + 1);
         p->normal = g_new0(ram_addr_t, page_count);
+        p->page_size = qemu_target_page_size();
 
         socket_send_channel_create(multifd_new_send_channel_async, p);
     }
@@ -1138,6 +1135,7 @@ int multifd_load_setup(Error **errp)
         p->name = g_strdup_printf("multifdrecv_%d", i);
         p->iov = g_new0(struct iovec, page_count);
         p->normal = g_new0(ram_addr_t, page_count);
+        p->page_size = qemu_target_page_size();
     }
 
     for (i = 0; i < thread_count; i++) {
We were calling qemu_target_page_size() left and right.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/multifd.h      |  4 ++++
 migration/multifd-zlib.c | 12 +++++-------
 migration/multifd-zstd.c | 12 +++++-------
 migration/multifd.c      | 18 ++++++++----------
 4 files changed, 22 insertions(+), 24 deletions(-)