[0/6] Introduce multifd zero page checking.

Message ID

20240206231908.1792529-1-hao.xiang@bytedance.com

Headers

From: Hao Xiang <hao.xiang@bytedance.com>
To: qemu-devel@nongnu.org,
	farosas@suse.de,
	peterx@redhat.com
Cc: Hao Xiang <hao.xiang@bytedance.com>
Subject: [PATCH 0/6] Introduce multifd zero page checking.
Date: Tue,  6 Feb 2024 23:19:02 +0000
Message-Id: <20240206231908.1792529-1-hao.xiang@bytedance.com>

Message

Hao Xiang Feb. 6, 2024, 11:19 p.m. UTC

This patchset is based on Juan Quintela's earlier series:
https://lore.kernel.org/all/20220802063907.18882-1-quintela@redhat.com/

In the multifd live migration model, a single migration main thread
scans the page map and queues pages to multiple multifd sender threads.
The main thread runs zero page checking on every page before queuing it
to the sender threads. Zero page checking is CPU-intensive, so doing all
of it on a single thread doesn't scale well. This patchset introduces a
new function that runs zero page checking on the multifd sender threads
instead. It also lays the groundwork for future changes that offload
zero page checking to accelerator hardware.
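
As a rough illustration of the idea (not code from this patchset), the
per-batch work that moves to a sender thread can be sketched as below.
The names page_is_zero() and split_batch(), the fixed 4 KiB PAGE_SIZE,
and the index-array layout are all assumptions made for this sketch.
QEMU has its own optimized buffer_is_zero() helper, and the real multifd
packet structures differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u  /* assumed page size for this sketch */

/* Hypothetical stand-in for a zero-page check: true if every byte is 0. */
static bool page_is_zero(const uint8_t *page)
{
    /*
     * If the first byte is 0 and every byte equals its predecessor,
     * then the whole page is 0.
     */
    return page[0] == 0 && memcmp(page, page + 1, PAGE_SIZE - 1) == 0;
}

/*
 * Hypothetical per-sender-thread step: split one queued batch of page
 * indexes into "normal" pages (page data must be sent) and "zero" pages
 * (only the index needs to be sent). Returns the number of zero pages;
 * the number of normal pages is stored in *n_normal.
 */
static size_t split_batch(const uint8_t *ram, const size_t *batch, size_t n,
                          size_t *normal, size_t *n_normal, size_t *zero)
{
    size_t n_zero = 0;

    *n_normal = 0;
    for (size_t i = 0; i < n; i++) {
        const uint8_t *page = ram + batch[i] * PAGE_SIZE;

        if (page_is_zero(page)) {
            zero[n_zero++] = batch[i];
        } else {
            normal[(*n_normal)++] = batch[i];
        }
    }
    return n_zero;
}
```

Because each sender thread would run this on its own batch, the scan
cost is spread across the channels instead of being serialized on the
main thread.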

Testing used two 4th generation Intel Xeon servers:

Architecture:        x86_64
CPU(s):              192
Thread(s) per core:  2
Core(s) per socket:  48
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               143
Model name:          Intel(R) Xeon(R) Platinum 8457C
Stepping:            8
CPU MHz:             2538.624
CPU max MHz:         3800.0000
CPU min MHz:         800.0000

Perform multifd live migration with the following setup:
1. The VM has 100GB of memory. All pages in the VM are zero pages.
2. Use a TCP socket for live migration.
3. Use 4 multifd channels with zero page checking on the migration main
thread.
4. Use 1/2/4 multifd channels with zero page checking on the multifd
sender threads.
5. Record the migration total time from the "info migrate" command on
the sender QEMU console.
6. Calculate throughput as "100GB / total time".

+------------------------------------------------------+
|zero-page-checking | total-time(ms) | throughput(GB/s)|
+------------------------------------------------------+
|main-thread        | 9629           | 10.38GB/s       |
+------------------------------------------------------+
|multifd-1-threads  | 6182           | 16.17GB/s       |
+------------------------------------------------------+
|multifd-2-threads  | 4643           | 21.53GB/s       |
+------------------------------------------------------+
|multifd-4-threads  | 4143           | 24.13GB/s       |
+------------------------------------------------------+
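
For reference, the arithmetic behind the throughput column can be
written out as a small sketch. The helper names throughput_gbps() and
speedup() are hypothetical and exist only for this illustration; the
inputs are the total-time values from the table and the 100GB guest
size.

```c
/* Throughput in GB/s for mem_gb of guest memory migrated in total_ms. */
static double throughput_gbps(double mem_gb, double total_ms)
{
    return mem_gb / (total_ms / 1000.0);
}

/* Speedup of a run relative to a baseline run, from the total times. */
static double speedup(double baseline_ms, double total_ms)
{
    return baseline_ms / total_ms;
}
```

For the main-thread row, throughput_gbps(100.0, 9629.0) is about 10.385,
which the table lists as 10.38; the table values appear to be truncated
to two decimals rather than rounded. Relative to that baseline,
speedup(9629.0, 4143.0) is roughly 2.32 for the 4-channel multifd run.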

Apply this patchset on top of commit
39a6e4f87e7b75a45b08d6dc8b8b7c2954c87440

Hao Xiang (6):
  migration/multifd: Add new migration option multifd-zero-page.
  migration/multifd: Add zero pages and zero bytes counter to migration
    status interface.
  migration/multifd: Support for zero pages transmission in multifd
    format.
  migration/multifd: Zero page transmission on the multifd thread.
  migration/multifd: Enable zero page checking from multifd threads.
  migration/multifd: Add a new migration test case for legacy zero page
    checking.

 migration/migration-hmp-cmds.c |  11 ++++
 migration/multifd.c            | 106 ++++++++++++++++++++++++++++-----
 migration/multifd.h            |  22 ++++++-
 migration/options.c            |  20 +++++++
 migration/options.h            |   1 +
 migration/ram.c                |  49 ++++++++++++---
 migration/trace-events         |   8 +--
 qapi/migration.json            |  39 ++++++++++--
 tests/qtest/migration-test.c   |  26 ++++++++
 9 files changed, 249 insertions(+), 33 deletions(-)

Comments

Peter Xu Feb. 7, 2024, 3:39 a.m. UTC | #1

On Tue, Feb 06, 2024 at 11:19:02PM +0000, Hao Xiang wrote:
> +------------------------------------------------------+
> |zero-page-checking | total-time(ms) | throughput(GB/s)|
> +------------------------------------------------------+
> |main-thread        | 9629           | 10.38GB/s       |
> +------------------------------------------------------+
> |multifd-1-threads  | 6182           | 16.17GB/s       |
> +------------------------------------------------------+
> |multifd-2-threads  | 4643           | 21.53GB/s       |
> +------------------------------------------------------+
> |multifd-4-threads  | 4143           | 24.13GB/s       |
> +------------------------------------------------------+

This "throughput" is slightly confusing; I was initially surprised to see a
large throughput for idle guests.  IMHO the "total-time" column is enough on
its own.  Feel free to drop the throughput column if there's a repost.

Did you check why throughput has mostly topped out by 4 channels?  Is it
because the main thread is already spinning at 100%?

Thanks,

Hao Xiang Feb. 8, 2024, 12:47 a.m. UTC | #2

On Tue, Feb 6, 2024 at 7:39 PM Peter Xu <peterx@redhat.com> wrote:
>
> This "throughput" is slightly confusing; I was initially surprised to see a
> large throughput for idle guests.  IMHO the "total-time" column is enough on
> its own.  Feel free to drop the throughput column if there's a repost.
>
> Did you check why throughput has mostly topped out by 4 channels?  Is it
> because the main thread is already spinning at 100%?
>
> Thanks,
>
> --
> Peter Xu

Sure, I will drop "throughput" to avoid confusion. In my testing, 1
multifd channel already makes the main thread spin at 100%. So the
total-time is the same across 1/2/4 multifd channels as long as zero
page checking runs on the main migration thread. Of course, this
assumes that the network is not the bottleneck. One interesting
finding is that 1 multifd channel with zero page checking on the
multifd thread performs better than 1 multifd channel with zero page
checking on the main migration thread.

Peter Xu Feb. 8, 2024, 2:36 a.m. UTC | #3

On Wed, Feb 07, 2024 at 04:47:27PM -0800, Hao Xiang wrote:
> Sure, I will drop "throughput" to avoid confusion. In my testing, 1
> multifd channel already makes the main thread spin at 100%. So the
> total-time is the same across 1/2/4 multifd channels as long as zero
> page checking runs on the main migration thread. Of course, this
> assumes that the network is not the bottleneck. One interesting
> finding is that 1 multifd channel with zero page checking on the
> multifd thread performs better than 1 multifd channel with zero page
> checking on the main migration thread.

It's probably because the main thread has more work to do than
"detecting zero pages" alone.

When zero detection is done in the main thread and the guest is fully
idle, scanning those pages already consumes a major portion of the main
thread's CPU time.  If all pages are zero, the multifd threads should be
fully idle, so n_channels may not matter here.

When 1 multifd thread is created with zero-page offloading, zero page
detection is fully offloaded from the main thread to the multifd thread,
even if there is only one.  The effect is similar to forking the main
thread into two threads, so the main thread can be more efficient on
other tasks (fetching/scanning dirty bits, etc.).

Thanks,