
migration: Rate limit inside host pages

Message ID: 20191205102918.63294-1-dgilbert@redhat.com
State: New
Series: migration: Rate limit inside host pages

Commit Message

Dr. David Alan Gilbert Dec. 5, 2019, 10:29 a.m. UTC
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

When using hugepages, rate limiting is necessary within each huge
page, since a 1G huge page can take a significant time to send;
otherwise you end up with bursty behaviour.

Fixes: 4c011c37ecb3 ("postcopy: Send whole huge pages")
Reported-by: Lin Ma <LMa@suse.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/migration.c  | 57 ++++++++++++++++++++++++------------------
 migration/migration.h  |  1 +
 migration/ram.c        |  2 ++
 migration/trace-events |  4 +--
 4 files changed, 37 insertions(+), 27 deletions(-)
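
To put rough numbers on the burstiness (illustrative figures, not part
of the patch): QEMU's rate limiter accounts in BUFFER_DELAY (100 ms)
windows, so a 1 Gbps limit allows roughly 12.5 MB per window.  A single
1 GiB huge page is about 86 windows' worth of quota; without a check
inside the page it goes out in one burst at link speed, and the sender
then stalls while the average catches up.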

Comments

Juan Quintela Dec. 5, 2019, 1:54 p.m. UTC | #1
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> When using hugepages, rate limiting is necessary within each huge
> page, since a 1G huge page can take a significant time to send;
> otherwise you end up with bursty behaviour.
>
> Fixes: 4c011c37ecb3 ("postcopy: Send whole huge pages")
> Reported-by: Lin Ma <LMa@suse.com>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---

Reviewed-by: Juan Quintela <quintela@redhat.com>

I agree that rate limiting needs to be done for huge pages.

> diff --git a/migration/ram.c b/migration/ram.c
> index a4ae3b3120..a9177c6a24 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2616,6 +2616,8 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
>  
>          pages += tmppages;
>          pss->page++;
> +        /* Allow rate limiting to happen in the middle of huge pages */
> +        migration_rate_limit();
>      } while ((pss->page & (pagesize_bits - 1)) &&
>               offset_in_ramblock(pss->block, pss->page << TARGET_PAGE_BITS));
>  

But this is doing the rate limit for each page, no?  Even when not
using huge pages.

Not that it should be a big issue (performance-wise).
Have you done any measurement?


Later, Juan.
Peter Xu Dec. 5, 2019, 1:55 p.m. UTC | #2
On Thu, Dec 05, 2019 at 10:29:18AM +0000, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> When using hugepages, rate limiting is necessary within each huge
> page, since a 1G huge page can take a significant time to send;
> otherwise you end up with bursty behaviour.
> 
> Fixes: 4c011c37ecb3 ("postcopy: Send whole huge pages")
> Reported-by: Lin Ma <LMa@suse.com>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  migration/migration.c  | 57 ++++++++++++++++++++++++------------------
>  migration/migration.h  |  1 +
>  migration/ram.c        |  2 ++
>  migration/trace-events |  4 +--
>  4 files changed, 37 insertions(+), 27 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 354ad072fa..27500d09a9 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3224,6 +3224,37 @@ void migration_consume_urgent_request(void)
>      qemu_sem_wait(&migrate_get_current()->rate_limit_sem);
>  }
>  
> +/* Returns true if the rate limiting was broken by an urgent request */
> +bool migration_rate_limit(void)
> +{
> +    int64_t now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> +    MigrationState *s = migrate_get_current();
> +
> +    bool urgent = false;
> +    migration_update_counters(s, now);
> +    if (qemu_file_rate_limit(s->to_dst_file)) {
> +        /*
> +         * Wait for a delay to do rate limiting OR
> +         * something urgent to post the semaphore.
> +         */
> +        int ms = s->iteration_start_time + BUFFER_DELAY - now;
> +        trace_migration_rate_limit_pre(ms);
> +        if (qemu_sem_timedwait(&s->rate_limit_sem, ms) == 0) {
> +            /*
> +             * We were woken by one or more urgent things but
> +             * the timedwait will have consumed one of them.
> +             * The service routine for the urgent wake will dec
> +             * the semaphore itself for each item it consumes,
> +             * so add this one we just eat back.
> +             */
> +            qemu_sem_post(&s->rate_limit_sem);

I remember I commented on this when it was first introduced, asking
whether we can avoid this post().  IMHO we can with something like an
eventfd: when we queue the page we write 1 to the eventfd, here we
poll() on the eventfd with the same timeout, then clear it after the
poll no matter what.  On unqueue, we can probably simply do nothing.
I'm not sure about Windows or other OSes, though...

Anyway this patch is not changing that part but to fix huge page
issue, so that's another story for sure.
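
For concreteness, a minimal sketch of the eventfd idea above
(Linux-only; the names and integration points are hypothetical, not
QEMU code):

    #include <poll.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <sys/eventfd.h>
    #include <unistd.h>

    /* Created once at setup: urgent_fd = eventfd(0, EFD_NONBLOCK); */
    static int urgent_fd;

    /* Queueing an urgent page: mark the fd readable. */
    static void urgent_queue_signal(void)
    {
        uint64_t one = 1;
        (void)write(urgent_fd, &one, sizeof(one));
    }

    /* Rate limiter: wait up to @ms, then clear the counter no matter
     * what (the read fails with EAGAIN if nothing was pending), so no
     * compensating post is needed and unqueue can do nothing. */
    static bool urgent_rate_limit_wait(int ms)
    {
        struct pollfd pfd = { .fd = urgent_fd, .events = POLLIN };
        bool urgent = poll(&pfd, 1, ms) > 0;
        uint64_t cnt;
        (void)read(urgent_fd, &cnt, sizeof(cnt));
        return urgent;
    }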

Reviewed-by: Peter Xu <peterx@redhat.com>

Thanks,
Dr. David Alan Gilbert Dec. 5, 2019, 2:30 p.m. UTC | #3
* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> > When using hugepages, rate limiting is necessary within each huge
> > page, since a 1G huge page can take a significant time to send;
> > otherwise you end up with bursty behaviour.
> >
> > Fixes: 4c011c37ecb3 ("postcopy: Send whole huge pages")
> > Reported-by: Lin Ma <LMa@suse.com>
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> 
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> 
> I agree that rate limiting needs to be done for huge pages.
> 
> > diff --git a/migration/ram.c b/migration/ram.c
> > index a4ae3b3120..a9177c6a24 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -2616,6 +2616,8 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
> >  
> >          pages += tmppages;
> >          pss->page++;
> > +        /* Allow rate limiting to happen in the middle of huge pages */
> > +        migration_rate_limit();
> >      } while ((pss->page & (pagesize_bits - 1)) &&
> >               offset_in_ramblock(pss->block, pss->page << TARGET_PAGE_BITS));
> >  
> 
> But this is doing the rate limit for each page, no?  Even when not
> using huge pages.

Right.

> Not that it should be a big issue (performance-wise).
> Have you done any measurement?

I've just given it a quick run; it still seems to be hitting ~9.5Gbps on
my 10Gbps interface, so it doesn't seem to be the limiting factor there.

Dave

> 
> 
> Later, Juan.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Patch

diff --git a/migration/migration.c b/migration/migration.c
index 354ad072fa..27500d09a9 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3224,6 +3224,37 @@ void migration_consume_urgent_request(void)
     qemu_sem_wait(&migrate_get_current()->rate_limit_sem);
 }
 
+/* Returns true if the rate limiting was broken by an urgent request */
+bool migration_rate_limit(void)
+{
+    int64_t now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    MigrationState *s = migrate_get_current();
+
+    bool urgent = false;
+    migration_update_counters(s, now);
+    if (qemu_file_rate_limit(s->to_dst_file)) {
+        /*
+         * Wait for a delay to do rate limiting OR
+         * something urgent to post the semaphore.
+         */
+        int ms = s->iteration_start_time + BUFFER_DELAY - now;
+        trace_migration_rate_limit_pre(ms);
+        if (qemu_sem_timedwait(&s->rate_limit_sem, ms) == 0) {
+            /*
+             * We were woken by one or more urgent things but
+             * the timedwait will have consumed one of them.
+             * The service routine for the urgent wake will dec
+             * the semaphore itself for each item it consumes,
+             * so add this one we just eat back.
+             */
+            qemu_sem_post(&s->rate_limit_sem);
+            urgent = true;
+        }
+        trace_migration_rate_limit_post(urgent);
+    }
+    return urgent;
+}
+
 /*
  * Master migration thread on the source VM.
  * It drives the migration and pumps the data down the outgoing channel.
@@ -3290,8 +3321,6 @@ static void *migration_thread(void *opaque)
     trace_migration_thread_setup_complete();
 
     while (migration_is_active(s)) {
-        int64_t current_time;
-
         if (urgent || !qemu_file_rate_limit(s->to_dst_file)) {
             MigIterateState iter_state = migration_iteration_run(s);
             if (iter_state == MIG_ITERATE_SKIP) {
@@ -3318,29 +3347,7 @@ static void *migration_thread(void *opaque)
             update_iteration_initial_status(s);
         }
 
-        current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
-
-        migration_update_counters(s, current_time);
-
-        urgent = false;
-        if (qemu_file_rate_limit(s->to_dst_file)) {
-            /* Wait for a delay to do rate limiting OR
-             * something urgent to post the semaphore.
-             */
-            int ms = s->iteration_start_time + BUFFER_DELAY - current_time;
-            trace_migration_thread_ratelimit_pre(ms);
-            if (qemu_sem_timedwait(&s->rate_limit_sem, ms) == 0) {
-                /* We were worken by one or more urgent things but
-                 * the timedwait will have consumed one of them.
-                 * The service routine for the urgent wake will dec
-                 * the semaphore itself for each item it consumes,
-                 * so add this one we just eat back.
-                 */
-                qemu_sem_post(&s->rate_limit_sem);
-                urgent = true;
-            }
-            trace_migration_thread_ratelimit_post(urgent);
-        }
+        urgent = migration_rate_limit();
     }
 
     trace_migration_thread_after_loop();
diff --git a/migration/migration.h b/migration/migration.h
index 79b3dda146..aa9ff6f27b 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -341,5 +341,6 @@ int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
 
 void migration_make_urgent_request(void);
 void migration_consume_urgent_request(void);
+bool migration_rate_limit(void);
 
 #endif
diff --git a/migration/ram.c b/migration/ram.c
index a4ae3b3120..a9177c6a24 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2616,6 +2616,8 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
 
         pages += tmppages;
         pss->page++;
+        /* Allow rate limiting to happen in the middle of huge pages */
+        migration_rate_limit();
     } while ((pss->page & (pagesize_bits - 1)) &&
              offset_in_ramblock(pss->block, pss->page << TARGET_PAGE_BITS));
 
diff --git a/migration/trace-events b/migration/trace-events
index 6dee7b5389..2f9129e213 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -138,12 +138,12 @@ migrate_send_rp_recv_bitmap(char *name, int64_t size) "block '%s' size 0x%"PRIi64
 migration_completion_file_err(void) ""
 migration_completion_postcopy_end(void) ""
 migration_completion_postcopy_end_after_complete(void) ""
+migration_rate_limit_pre(int ms) "%d ms"
+migration_rate_limit_post(int urgent) "urgent: %d"
 migration_return_path_end_before(void) ""
 migration_return_path_end_after(int rp_error) "%d"
 migration_thread_after_loop(void) ""
 migration_thread_file_err(void) ""
-migration_thread_ratelimit_pre(int ms) "%d ms"
-migration_thread_ratelimit_post(int urgent) "urgent: %d"
 migration_thread_setup_complete(void) ""
 open_return_path_on_source(void) ""
 open_return_path_on_source_continue(void) ""
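
As an aside on the semaphore pattern that migration_rate_limit()
factors out: a standalone sketch in plain POSIX (illustrative, not
QEMU's qemu_sem wrappers) of why the successful timedwait has to hand
the post back:

    #include <semaphore.h>
    #include <stdbool.h>
    #include <time.h>

    /* The producer posts once per urgent request it queues... */
    static void make_urgent_request(sem_t *sem)
    {
        sem_post(sem);
    }

    /* ...and the service routine waits once per request it consumes,
     * so the semaphore count must always match the queue length. */
    static void consume_urgent_request(sem_t *sem)
    {
        sem_wait(sem);
    }

    /* Sleep up to @ms for rate limiting, unless an urgent post
     * arrives first.  A successful timedwait has eaten one post the
     * service routine still expects, so it is returned before we
     * report the urgent wakeup. */
    static bool rate_limit_wait(sem_t *sem, int ms)
    {
        struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);
        ts.tv_sec  += ms / 1000;
        ts.tv_nsec += (long)(ms % 1000) * 1000000;
        if (ts.tv_nsec >= 1000000000L) {
            ts.tv_sec++;
            ts.tv_nsec -= 1000000000L;
        }
        if (sem_timedwait(sem, &ts) == 0) {
            sem_post(sem);   /* rebalance for consume_urgent_request() */
            return true;     /* woken early: urgent request pending */
        }
        return false;        /* timeout elapsed: plain rate limiting */
    }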