[v2,06/15] migration: Yield bitmap_mutex properly when sending/sleeping

Message ID: 20221011215559.602584-7-peterx@redhat.com
State: New
Series: migration: Postcopy Preempt-Full

Commit Message

Peter Xu Oct. 11, 2022, 9:55 p.m. UTC
Don't take the bitmap mutex when sending pages, or when being throttled by
migration_rate_limit() (which is a bit tricky to call here in ram code, but
it still seems helpful).

This prepares for the possibility of concurrently sending pages in more
than one thread using ram_save_host_page(): all threads may need the
bitmap_mutex to operate on the bitmaps, so a sendmsg() or any kind of
qemu_sem_wait() blocking one thread must not block the others from
progressing.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/ram.c | 41 ++++++++++++++++++++++++++++++-----------
 1 file changed, 30 insertions(+), 11 deletions(-)
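
In outline, the change is the classic "drop the lock around anything that can
block, re-take it before touching the shared state again" pattern.  A minimal
sketch of the idea follows; save_one_page(), send_page(), rate_limit(),
test_and_clear_dirty() and the State type are simplified placeholders for
illustration, not the actual QEMU functions:

    /*
     * Sketch of the lock-yield pattern (placeholder names, not the
     * literal QEMU code).  Invariant: bitmap_mutex is held on entry
     * and on return, but the blocking work in the middle runs
     * unlocked, so one thread blocked in sendmsg() or qemu_sem_wait()
     * does not stall the others on the mutex.
     */
    static int save_one_page(State *s)
    {
        int ret = 0;

        /* bitmap operation: requires the lock, which the caller holds */
        bool dirty = test_and_clear_dirty(s);

        if (dirty) {
            qemu_mutex_unlock(&s->bitmap_mutex); /* yield across blocking calls */
            ret = send_page(s);                  /* may block in sendmsg() */
            rate_limit(s);                       /* may block in qemu_sem_wait() */
            qemu_mutex_lock(&s->bitmap_mutex);   /* held again before return */
        }
        return ret;
    }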

Comments

Dr. David Alan Gilbert Oct. 12, 2022, 4:43 p.m. UTC | #1
* Peter Xu (peterx@redhat.com) wrote:
> Don't take the bitmap mutex when sending pages, or when being throttled by
> migration_rate_limit() (which is a bit tricky to call here in ram code, but
> it still seems helpful).
> 
> This prepares for the possibility of concurrently sending pages in more
> than one thread using ram_save_host_page(): all threads may need the
> bitmap_mutex to operate on the bitmaps, so a sendmsg() or any kind of
> qemu_sem_wait() blocking one thread must not block the others from
> progressing.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

although a comment above the declaration of ram_save_host_page() saying
it can drop the lock would be very good.

Dave


> ---
>  migration/ram.c | 41 ++++++++++++++++++++++++++++++-----------
>  1 file changed, 30 insertions(+), 11 deletions(-)
> 
> [...]
Peter Xu Oct. 12, 2022, 5:51 p.m. UTC | #2
On Wed, Oct 12, 2022 at 05:43:53PM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > [...]
> 
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> 
> although a comment above the declaration of ram_save_host_page() saying
> it can drop the lock would be very good.

Let me add that.  Thanks,
Peter Xu Oct. 13, 2022, 4:19 p.m. UTC | #3
On Wed, Oct 12, 2022 at 01:51:07PM -0400, Peter Xu wrote:
> On Wed, Oct 12, 2022 at 05:43:53PM +0100, Dr. David Alan Gilbert wrote:
> > * Peter Xu (peterx@redhat.com) wrote:
> > > [...]
> > 
> > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > 
> > although a comment above the declaration of ram_save_host_page() saying
> > it can drop the lock would be very good.
> 
> Let me add that.  Thanks,

A fixup to this patch is attached, touching up the comment for
ram_save_host_page().
Dr. David Alan Gilbert Oct. 13, 2022, 4:37 p.m. UTC | #4
* Peter Xu (peterx@redhat.com) wrote:
> On Wed, Oct 12, 2022 at 01:51:07PM -0400, Peter Xu wrote:
> > On Wed, Oct 12, 2022 at 05:43:53PM +0100, Dr. David Alan Gilbert wrote:
> > > * Peter Xu (peterx@redhat.com) wrote:
> > > > [...]
> > > 
> > > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > 
> > > although a comment above the declaration of ram_save_host_page() saying
> > > it can drop the lock would be very good.
> > 
> > Let me add that.  Thanks,
> 
> A fixup to this patch is attached, touching up the comment for
> ram_save_host_page().

Yep, that's right (I don't think we have any formal annotation for
locks).

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> -- 
> Peter Xu

> From dcc3adce062df7216851890d49f7d2b1fa2e84a4 Mon Sep 17 00:00:00 2001
> From: Peter Xu <peterx@redhat.com>
> Date: Thu, 13 Oct 2022 12:18:04 -0400
> Subject: [PATCH] fixup! migration: Yield bitmap_mutex properly when
>  sending/sleeping
> Content-type: text/plain
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/ram.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 538667b974..b311ece48c 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2407,9 +2407,14 @@ out:
>   * a host page in which case the remainder of the hostpage is sent.
>   * Only dirty target pages are sent. Note that the host page size may
>   * be a huge page for this block.
> + *
>   * The saving stops at the boundary of the used_length of the block
>   * if the RAMBlock isn't a multiple of the host page size.
>   *
> + * The caller must hold ram_state.bitmap_mutex when calling this
> + * function.  Note that this function can temporarily release the lock, but
> + * it makes sure the lock is held again before it returns.
> + *
>   * Returns the number of pages written or negative on error
>   *
>   * @rs: current RAM state
> -- 
> 2.37.3
>
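
On the "formal annotation" point: Clang's -Wthread-safety attributes can
encode a caller-must-hold contract, although the declaration still says
nothing about the function dropping the lock temporarily, so the comment in
the fixup remains the useful part.  A purely hypothetical sketch (QEMU's
QemuMutex carried no such annotations here):

    /*
     * Hypothetical Clang thread-safety sketch -- not QEMU code.
     * "requires_capability" tells callers the lock must be held across
     * the call; the temporary unlock/relock inside the function is not
     * visible in the declaration, hence the comment above
     * ram_save_host_page() in the fixup.
     */
    struct __attribute__((capability("mutex"))) BitmapMutex {
        int placeholder;
    };

    extern struct BitmapMutex bitmap_mutex;

    int sketch_save_host_page(void)
        __attribute__((requires_capability(bitmap_mutex)));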
Juan Quintela Nov. 15, 2022, 9:19 a.m. UTC | #5
Peter Xu <peterx@redhat.com> wrote:
> On Wed, Oct 12, 2022 at 01:51:07PM -0400, Peter Xu wrote:
>> On Wed, Oct 12, 2022 at 05:43:53PM +0100, Dr. David Alan Gilbert wrote:
>> > * Peter Xu (peterx@redhat.com) wrote:
>> > > [...]
>> > 
>> > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> > 
>> > although a comment above the declaration of ram_save_host_page() saying
>> > it can drop the lock would be very good.
>> 
>> Let me add that.  Thanks,
>
> A fixup to this patch is attached, touching up the comment for
> ram_save_host_page().

Reviewed-by: Juan Quintela <quintela@redhat.com>

Patch

diff --git a/migration/ram.c b/migration/ram.c
index b9ac2d6921..578ad8d70a 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2462,6 +2462,7 @@ static void postcopy_preempt_reset_channel(RAMState *rs)
  */
 static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss)
 {
+    bool page_dirty, preempt_active = postcopy_preempt_active();
     int tmppages, pages = 0;
     size_t pagesize_bits =
         qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
@@ -2485,22 +2486,40 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss)
             break;
         }
 
-        /* Check the pages is dirty and if it is send it */
-        if (migration_bitmap_clear_dirty(rs, pss->block, pss->page)) {
-            tmppages = ram_save_target_page(rs, pss);
-            if (tmppages < 0) {
-                return tmppages;
-            }
+        page_dirty = migration_bitmap_clear_dirty(rs, pss->block, pss->page);
 
-            pages += tmppages;
+        /* Check the pages is dirty and if it is send it */
+        if (page_dirty) {
             /*
-             * Allow rate limiting to happen in the middle of huge pages if
-             * something is sent in the current iteration.
+             * Properly yield the lock only in postcopy preempt mode
+             * because both migration thread and rp-return thread can
+             * operate on the bitmaps.
              */
-            if (pagesize_bits > 1 && tmppages > 0) {
-                migration_rate_limit();
+            if (preempt_active) {
+                qemu_mutex_unlock(&rs->bitmap_mutex);
+            }
+            tmppages = ram_save_target_page(rs, pss);
+            if (tmppages >= 0) {
+                pages += tmppages;
+                /*
+                 * Allow rate limiting to happen in the middle of huge pages if
+                 * something is sent in the current iteration.
+                 */
+                if (pagesize_bits > 1 && tmppages > 0) {
+                    migration_rate_limit();
+                }
             }
+            if (preempt_active) {
+                qemu_mutex_lock(&rs->bitmap_mutex);
+            }
+        } else {
+            tmppages = 0;
         }
+
+        if (tmppages < 0) {
+            return tmppages;
+        }
+
         pss->page = migration_bitmap_find_dirty(rs, pss->block, pss->page);
     } while ((pss->page < hostpage_boundary) &&
              offset_in_ramblock(pss->block,
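
Putting the fixup's documented contract together with the patch, the caller
side can be sketched as below; more_dirty() is a placeholder, not the actual
logic in ram_find_and_save_block():

    /*
     * Sketch of the caller-side contract (hypothetical loop, not the
     * literal QEMU call site): bitmap_mutex is held around the whole
     * scan, and ram_save_host_page() may drop and re-take it while
     * sending or sleeping, so other threads can make progress in the
     * meantime.
     */
    qemu_mutex_lock(&rs->bitmap_mutex);
    while (more_dirty(rs, pss)) {
        /* may temporarily release bitmap_mutex internally */
        int pages = ram_save_host_page(rs, pss);
        if (pages < 0) {
            break; /* error path: the lock is still held here */
        }
    }
    qemu_mutex_unlock(&rs->bitmap_mutex);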