From patchwork Fri Jun 8 08:10:38 2018
X-Patchwork-Submitter: "Wang, Wei W"
X-Patchwork-Id: 926641
From: Wei Wang
To: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, mst@redhat.com, quintela@redhat.com, dgilbert@redhat.com
Date: Fri, 8 Jun 2018 16:10:38 +0800
Message-Id: <1528445443-43406-2-git-send-email-wei.w.wang@intel.com>
In-Reply-To: <1528445443-43406-1-git-send-email-wei.w.wang@intel.com>
References: <1528445443-43406-1-git-send-email-wei.w.wang@intel.com>
Subject: [Qemu-devel] [PATCH v8 1/6] bitmap: bitmap_count_one_with_offset
Cc: yang.zhang.wz@gmail.com, quan.xu0@gmail.com, liliang.opensource@gmail.com, peterx@redhat.com, wei.w.wang@intel.com, pbonzini@redhat.com, nilal@redhat.com

Count the number of 1s in a bitmap starting from an offset.

Signed-off-by: Wei Wang
CC: Dr. David Alan Gilbert
CC: Juan Quintela
CC: Michael S. Tsirkin
Reviewed-by: Dr. David Alan Gilbert
---
 include/qemu/bitmap.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index 509eedd..e3f31f1 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -228,6 +228,19 @@ static inline long bitmap_count_one(const unsigned long *bitmap, long nbits)
     }
 }
 
+static inline long bitmap_count_one_with_offset(const unsigned long *bitmap,
+                                                long offset, long nbits)
+{
+    long aligned_offset = QEMU_ALIGN_DOWN(offset, BITS_PER_LONG);
+    long redundant_bits = offset - aligned_offset;
+    long bits_to_count = nbits + redundant_bits;
+    const unsigned long *bitmap_start = bitmap +
+                                        aligned_offset / BITS_PER_LONG;
+
+    return bitmap_count_one(bitmap_start, bits_to_count) -
+           bitmap_count_one(bitmap_start, redundant_bits);
+}
+
 void bitmap_set(unsigned long *map, long i, long len);
 void bitmap_set_atomic(unsigned long *map, long i, long len);
 void bitmap_clear(unsigned long *map, long start, long nr);

From patchwork Fri Jun 8 08:10:39 2018
X-Patchwork-Submitter: "Wang, Wei W"
X-Patchwork-Id: 926643
From: Wei Wang
To: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, mst@redhat.com, quintela@redhat.com, dgilbert@redhat.com
Date: Fri, 8 Jun 2018 16:10:39 +0800
Message-Id: <1528445443-43406-3-git-send-email-wei.w.wang@intel.com>
In-Reply-To: <1528445443-43406-1-git-send-email-wei.w.wang@intel.com>
References: <1528445443-43406-1-git-send-email-wei.w.wang@intel.com>
Subject: [Qemu-devel] [PATCH v8 2/6] migration: use bitmap_mutex in migration_bitmap_clear_dirty
Cc: yang.zhang.wz@gmail.com, quan.xu0@gmail.com, liliang.opensource@gmail.com, peterx@redhat.com, wei.w.wang@intel.com, pbonzini@redhat.com, nilal@redhat.com

The bitmap mutex is used to synchronize threads to update the dirty
bitmap and the migration_dirty_pages counter. For example, the free
page optimization clears bits of free pages from the bitmap in an
iothread context. This patch makes migration_bitmap_clear_dirty update
the bitmap and counter under the mutex.

Signed-off-by: Wei Wang
CC: Dr. David Alan Gilbert
CC: Juan Quintela
CC: Michael S. Tsirkin
CC: Peter Xu
---
 migration/ram.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index c53e836..2eabbe9 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1093,11 +1093,14 @@ static inline bool migration_bitmap_clear_dirty(RAMState *rs,
 {
     bool ret;
 
+    qemu_mutex_lock(&rs->bitmap_mutex);
     ret = test_and_clear_bit(page, rb->bmap);
 
     if (ret) {
         rs->migration_dirty_pages--;
     }
+    qemu_mutex_unlock(&rs->bitmap_mutex);
+
     return ret;
 }
 

From patchwork Fri Jun 8 08:10:40 2018
X-Patchwork-Submitter: "Wang, Wei W"
X-Patchwork-Id: 926639
From: Wei Wang
To: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, mst@redhat.com, quintela@redhat.com, dgilbert@redhat.com
Date: Fri, 8 Jun 2018 16:10:40 +0800
Message-Id: <1528445443-43406-4-git-send-email-wei.w.wang@intel.com>
In-Reply-To: <1528445443-43406-1-git-send-email-wei.w.wang@intel.com>
References: <1528445443-43406-1-git-send-email-wei.w.wang@intel.com>
Subject: [Qemu-devel] [PATCH v8 3/6] migration: API to clear bits of guest free pages from the dirty bitmap
Cc: yang.zhang.wz@gmail.com, quan.xu0@gmail.com, liliang.opensource@gmail.com, peterx@redhat.com, wei.w.wang@intel.com, pbonzini@redhat.com, nilal@redhat.com

This patch adds an API to clear bits corresponding to guest free pages
from the dirty bitmap. Split the free page block if it crosses the QEMU
RAMBlock boundary.

Signed-off-by: Wei Wang
CC: Dr. David Alan Gilbert
CC: Juan Quintela
CC: Michael S. Tsirkin
CC: Peter Xu
---
 include/migration/misc.h |  2 ++
 migration/migration.c    |  2 +-
 migration/migration.h    |  1 +
 migration/ram.c          | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 4ebf24c..113320e 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -14,11 +14,13 @@
 #ifndef MIGRATION_MISC_H
 #define MIGRATION_MISC_H
 
+#include "exec/cpu-common.h"
 #include "qemu/notify.h"
 
 /* migration/ram.c */
 
 void ram_mig_init(void);
+void qemu_guest_free_page_hint(void *addr, size_t len);
 
 /* migration/block.c */
 
diff --git a/migration/migration.c b/migration/migration.c
index 05aec2c..220ff48 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -647,7 +647,7 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
  * Return true if we're already in the middle of a migration
  * (i.e. any of the active or setup states)
  */
-static bool migration_is_setup_or_active(int state)
+bool migration_is_setup_or_active(int state)
 {
     switch (state) {
     case MIGRATION_STATUS_ACTIVE:
diff --git a/migration/migration.h b/migration/migration.h
index 8f0c821..5a74740 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -230,6 +230,7 @@ void migrate_fd_error(MigrationState *s, const Error *error);
 void migrate_fd_connect(MigrationState *s, Error *error_in);
 
 void migrate_init(MigrationState *s);
+bool migration_is_setup_or_active(int state);
 bool migration_is_blocked(Error **errp);
 /* True if outgoing migration has entered postcopy phase */
 bool migration_in_postcopy(void);
diff --git a/migration/ram.c b/migration/ram.c
index 2eabbe9..237f11e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2530,6 +2530,54 @@ static void ram_state_resume_prepare(RAMState *rs, QEMUFile *out)
 }
 
 /*
+ * This function clears bits of the free pages reported by the caller from the
+ * migration dirty bitmap. @addr is the host address corresponding to the
+ * start of the continuous guest free pages, and @len is the total bytes of
+ * those pages.
+ */
+void qemu_guest_free_page_hint(void *addr, size_t len)
+{
+    RAMBlock *block;
+    ram_addr_t offset;
+    size_t used_len, start, npages;
+    MigrationState *s = migrate_get_current();
+
+    /* This function is currently expected to be used during live migration */
+    if (!migration_is_setup_or_active(s->state)) {
+        return;
+    }
+
+    for (; len > 0; len -= used_len) {
+        block = qemu_ram_block_from_host(addr, false, &offset);
+        assert(block);
+
+        /*
+         * This handles the case that the RAMBlock is resized after the free
+         * page hint is reported.
+         */
+        if (unlikely(offset > block->used_length)) {
+            return;
+        }
+
+        if (len <= block->used_length - offset) {
+            used_len = len;
+        } else {
+            used_len = block->used_length - offset;
+            addr += used_len;
+        }
+
+        start = offset >> TARGET_PAGE_BITS;
+        npages = used_len >> TARGET_PAGE_BITS;
+
+        qemu_mutex_lock(&ram_state->bitmap_mutex);
+        ram_state->migration_dirty_pages -=
+                      bitmap_count_one_with_offset(block->bmap, start, npages);
+        bitmap_clear(block->bmap, start, npages);
+        qemu_mutex_unlock(&ram_state->bitmap_mutex);
+    }
+}
+
+/*
  * Each of ram_save_setup, ram_save_iterate and ram_save_complete has
  * long-running RCU critical section. When rcu-reclaims in the code
  * start to become numerous it will be necessary to reduce the

From patchwork Fri Jun 8 08:10:41 2018
X-Patchwork-Submitter: "Wang, Wei W"
X-Patchwork-Id: 926644
From: Wei Wang
To: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, mst@redhat.com, quintela@redhat.com, dgilbert@redhat.com
Date: Fri, 8 Jun 2018 16:10:41 +0800
Message-Id: <1528445443-43406-5-git-send-email-wei.w.wang@intel.com>
In-Reply-To: <1528445443-43406-1-git-send-email-wei.w.wang@intel.com>
References: <1528445443-43406-1-git-send-email-wei.w.wang@intel.com>
Subject: [Qemu-devel] [PATCH v8 4/6] migration/ram.c: add ram save state notifiers
Cc: yang.zhang.wz@gmail.com, quan.xu0@gmail.com, liliang.opensource@gmail.com, peterx@redhat.com, wei.w.wang@intel.com, pbonzini@redhat.com, nilal@redhat.com

This patch adds a ram save state notifier list and exposes RAMState for
the notifier callbacks to use.

Signed-off-by: Wei Wang
CC: Dr. David Alan Gilbert
CC: Juan Quintela
CC: Michael S. Tsirkin
CC: Peter Xu
---
 include/migration/misc.h | 52 +++++++++++++++++++++++++++++++++++++++
 migration/ram.c          | 64 +++++++++++++++++-------------------------------
 2 files changed, 75 insertions(+), 41 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 113320e..b970d7d 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -16,9 +16,61 @@
 
 #include "exec/cpu-common.h"
 #include "qemu/notify.h"
+#include "qemu/thread.h"
 
 /* migration/ram.c */
 
+typedef enum RamSaveState {
+    RAM_SAVE_ERR = 0,
+    RAM_SAVE_RESET = 1,
+    RAM_SAVE_BEFORE_SYNC_BITMAP = 2,
+    RAM_SAVE_AFTER_SYNC_BITMAP = 3,
+    RAM_SAVE_MAX = 4,
+} RamSaveState;
+
+/* State of RAM for migration */
+typedef struct RAMState {
+    /* QEMUFile used for this migration */
+    QEMUFile *f;
+    /* Last block that we have visited searching for dirty pages */
+    RAMBlock *last_seen_block;
+    /* Last block from where we have sent data */
+    RAMBlock *last_sent_block;
+    /* Last dirty target page we have sent */
+    ram_addr_t last_page;
+    /* last ram version we have seen */
+    uint32_t last_version;
+    /* We are in the first round */
+    bool ram_bulk_stage;
+    /* How many times we have dirty too many pages */
+    int dirty_rate_high_cnt;
+    /* ram save states used for notifiers */
+    int ram_save_state;
+    /* these variables are used for bitmap sync */
+    /* last time we did a full bitmap_sync */
+    int64_t time_last_bitmap_sync;
+    /* bytes transferred at start_time */
+    uint64_t bytes_xfer_prev;
+    /* number of dirty pages since start_time */
+    uint64_t num_dirty_pages_period;
+    /* xbzrle misses since the beginning of the period */
+    uint64_t xbzrle_cache_miss_prev;
+    /* number of iterations at the beginning of period */
+    uint64_t iterations_prev;
+    /* Iterations since start */
+    uint64_t iterations;
+    /* number of dirty bits in the bitmap */
+    uint64_t migration_dirty_pages;
+    /* protects modification of the bitmap */
+    QemuMutex bitmap_mutex;
+    /* The RAMBlock used in the last src_page_requests */
+    RAMBlock *last_req_rb;
+    /* Queue of outstanding page requests from the destination */
+    QemuMutex src_page_req_mutex;
+    QSIMPLEQ_HEAD(src_page_requests, RAMSrcPageRequest) src_page_requests;
+} RAMState;
+
+void add_ram_save_state_change_notifier(Notifier *notify);
 void ram_mig_init(void);
 void qemu_guest_free_page_hint(void *addr, size_t len);
 
diff --git a/migration/ram.c b/migration/ram.c
index 237f11e..d45b5a4 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -56,6 +56,9 @@
 #include "qemu/uuid.h"
 #include "savevm.h"
 
+static NotifierList ram_save_state_notifiers =
+    NOTIFIER_LIST_INITIALIZER(ram_save_state_notifiers);
+
 /***********************************************************/
 /* ram save/restore */
 
@@ -267,49 +270,18 @@ struct RAMSrcPageRequest {
     QSIMPLEQ_ENTRY(RAMSrcPageRequest) next_req;
 };
 
-/* State of RAM for migration */
-struct RAMState {
-    /* QEMUFile used for this migration */
-    QEMUFile *f;
-    /* Last block that we have visited searching for dirty pages */
-    RAMBlock *last_seen_block;
-    /* Last block from where we have sent data */
-    RAMBlock *last_sent_block;
-    /* Last dirty target page we have sent */
-    ram_addr_t last_page;
-    /* last ram version we have seen */
-    uint32_t last_version;
-    /* We are in the first round */
-    bool ram_bulk_stage;
-    /* How many times we have dirty too many pages */
-    int dirty_rate_high_cnt;
-    /* these variables are used for bitmap sync */
-    /* last time we did a full bitmap_sync */
-    int64_t time_last_bitmap_sync;
-    /* bytes transferred at start_time */
-    uint64_t bytes_xfer_prev;
-    /* number of dirty pages since start_time */
-    uint64_t num_dirty_pages_period;
-    /* xbzrle misses since the beginning of the period */
-    uint64_t xbzrle_cache_miss_prev;
-    /* number of iterations at the beginning of period */
-    uint64_t iterations_prev;
-    /* Iterations since start */
-    uint64_t iterations;
-    /* number of dirty bits in the bitmap */
-    uint64_t migration_dirty_pages;
-    /* protects modification of the bitmap */
-    QemuMutex bitmap_mutex;
-    /* The RAMBlock used in the last src_page_requests */
-    RAMBlock *last_req_rb;
-    /* Queue of outstanding page requests from the destination */
-    QemuMutex src_page_req_mutex;
-    QSIMPLEQ_HEAD(src_page_requests, RAMSrcPageRequest) src_page_requests;
-};
-typedef struct RAMState RAMState;
-
 static RAMState *ram_state;
 
+void add_ram_save_state_change_notifier(Notifier *notify)
+{
+    notifier_list_add(&ram_save_state_notifiers, notify);
+}
+
+static void notify_ram_save_state_change_notifier(void)
+{
+    notifier_list_notify(&ram_save_state_notifiers, ram_state);
+}
+
 uint64_t ram_bytes_remaining(void)
 {
     return ram_state ? (ram_state->migration_dirty_pages * TARGET_PAGE_SIZE) :
@@ -1139,6 +1111,9 @@ static void migration_bitmap_sync(RAMState *rs)
     int64_t end_time;
     uint64_t bytes_xfer_now;
 
+    rs->ram_save_state = RAM_SAVE_BEFORE_SYNC_BITMAP;
+    notify_ram_save_state_change_notifier();
+
     ram_counters.dirty_sync_count++;
 
     if (!rs->time_last_bitmap_sync) {
@@ -1205,6 +1180,9 @@ static void migration_bitmap_sync(RAMState *rs)
     if (migrate_use_events()) {
         qapi_event_send_migration_pass(ram_counters.dirty_sync_count, NULL);
     }
+
+    rs->ram_save_state = RAM_SAVE_AFTER_SYNC_BITMAP;
+    notify_ram_save_state_change_notifier();
 }
 
 /**
@@ -1961,6 +1939,8 @@ static void ram_state_reset(RAMState *rs)
     rs->last_page = 0;
     rs->last_version = ram_list.version;
     rs->ram_bulk_stage = true;
+    rs->ram_save_state = RAM_SAVE_RESET;
+    notify_ram_save_state_change_notifier();
 }
 
 #define MAX_WAIT 50 /* ms, half buffered_file limit */
@@ -2709,6 +2689,8 @@ out:
 
     ret = qemu_file_get_error(f);
     if (ret < 0) {
+        rs->ram_save_state = RAM_SAVE_ERR;
+        notify_ram_save_state_change_notifier();
         return ret;
     }
 

From patchwork Fri Jun 8 08:10:42 2018
X-Patchwork-Submitter: "Wang, Wei W"
X-Patchwork-Id: 926642
From: Wei Wang
To: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, mst@redhat.com, quintela@redhat.com, dgilbert@redhat.com
Date: Fri, 8 Jun 2018 16:10:42 +0800
Message-Id: <1528445443-43406-6-git-send-email-wei.w.wang@intel.com>
In-Reply-To: <1528445443-43406-1-git-send-email-wei.w.wang@intel.com>
References: <1528445443-43406-1-git-send-email-wei.w.wang@intel.com>
Subject: [Qemu-devel] [PATCH v8 5/6] migration: move migrate_postcopy() to include/migration/misc.h
Cc: yang.zhang.wz@gmail.com, quan.xu0@gmail.com, liliang.opensource@gmail.com, peterx@redhat.com, wei.w.wang@intel.com, pbonzini@redhat.com, nilal@redhat.com

The ram save state notifier callback, for example the free page
optimization offered by virtio-balloon, may need to check if postcopy
is in use, so move migrate_postcopy() to the outside header.

Signed-off-by: Wei Wang
CC: Dr. David Alan Gilbert
CC: Juan Quintela
CC: Michael S. Tsirkin
CC: Peter Xu
Reviewed-by: Dr. David Alan Gilbert
---
 include/migration/misc.h | 1 +
 migration/migration.h    | 2 --
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index b970d7d..911aaf3 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -109,6 +109,7 @@ bool migration_has_failed(MigrationState *);
 /* ...and after the device transmission */
 bool migration_in_postcopy_after_devices(MigrationState *);
 void migration_global_dump(Monitor *mon);
+bool migrate_postcopy(void);
 
 /* migration/block-dirty-bitmap.c */
 void dirty_bitmap_mig_init(void);
diff --git a/migration/migration.h b/migration/migration.h
index 5a74740..fee5af8 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -236,8 +236,6 @@ bool migration_is_blocked(Error **errp);
 bool migration_in_postcopy(void);
 MigrationState *migrate_get_current(void);
 
-bool migrate_postcopy(void);
-
 bool migrate_release_ram(void);
 bool migrate_postcopy_ram(void);
 bool migrate_zero_blocks(void);

From patchwork Fri Jun 8 08:10:43 2018
X-Patchwork-Submitter: "Wang, Wei W"
X-Patchwork-Id: 926645
From: Wei Wang
To: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, mst@redhat.com, quintela@redhat.com, dgilbert@redhat.com
Date: Fri, 8 Jun 2018 16:10:43 +0800
Message-Id: <1528445443-43406-7-git-send-email-wei.w.wang@intel.com>
In-Reply-To: <1528445443-43406-1-git-send-email-wei.w.wang@intel.com>
References: <1528445443-43406-1-git-send-email-wei.w.wang@intel.com>
Subject: [Qemu-devel] [PATCH v8 6/6] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
Cc: yang.zhang.wz@gmail.com, quan.xu0@gmail.com, liliang.opensource@gmail.com, peterx@redhat.com, wei.w.wang@intel.com, pbonzini@redhat.com, nilal@redhat.com

The new feature enables the virtio-balloon device to receive hints of
guest free pages from the free page vq.

A notifier is registered to the migration ram save state notifier list.
The notifier calls free_page_start after the migration thread syncs the
dirty bitmap, so that the free page hinting optimization starts to clear
bits of free pages from the bitmap. It calls free_page_stop before the
migration thread syncs the bitmap, which is the end of the current round
of ram save. free_page_stop is also called to stop the optimization if
an error occurs during ram save.

Note: balloon will report pages which were free at the time of this call.
As the reporting happens asynchronously, dirty bit logging must be
enabled before this free_page_start call is made. Guest reporting must be
disabled before the migration dirty bitmap is synchronized.

TODO:
- If free pages are poisoned by guest, the hints are dropped currently. We
  will support clearing bits of poisoned pages from the bitmap in the
  future.

Signed-off-by: Wei Wang
CC: Michael S. Tsirkin
CC: Dr. David Alan Gilbert
CC: Juan Quintela
CC: Peter Xu
---
 hw/virtio/virtio-balloon.c                      | 260 ++++++++++++++++++++++++
 include/hw/virtio/virtio-balloon.h              |  28 ++-
 include/standard-headers/linux/virtio_balloon.h |   7 +
 3 files changed, 294 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 1f7a87f..a7bb971 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -28,6 +28,7 @@
 #include "qapi/visitor.h"
 #include "trace.h"
 #include "qemu/error-report.h"
+#include "migration/misc.h"
 
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
@@ -310,6 +311,166 @@ out:
     }
 }
 
+static void get_free_page_hints(VirtIOBalloon *dev)
+{
+    VirtQueueElement *elem;
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VirtQueue *vq = dev->free_page_vq;
+    uint32_t id;
+    size_t size;
+
+    while (dev->block_iothread) {
+        qemu_cond_wait(&dev->free_page_cond, &dev->free_page_lock);
+    }
+
+    /*
+     * If the migration thread actively stops the reporting, exit
+     * immediately.
+     */
+    if (dev->free_page_report_status == FREE_PAGE_REPORT_S_STOP) {
+        return;
+    }
+
+    elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
+    if (!elem) {
+        return;
+    }
+
+    if (elem->out_num) {
+        size = iov_to_buf(elem->out_sg, elem->out_num, 0, &id, sizeof(id));
+        virtqueue_push(vq, elem, size);
+        g_free(elem);
+
+        virtio_tswap32s(vdev, &id);
+        if (unlikely(size != sizeof(id))) {
+            virtio_error(vdev, "received an incorrect cmd id");
+            return;
+        }
+        if (id == dev->free_page_report_cmd_id) {
+            dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
+        } else {
+            /*
+             * Stop the optimization only when it has started. This
+             * avoids a stale stop sign for the previous command.
+             */
+            if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
+                dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+                return;
+            }
+        }
+    }
+
+    if (elem->in_num) {
+        /* TODO: send the poison value to the destination */
+        if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START &&
+            !dev->poison_val) {
+            qemu_guest_free_page_hint(elem->in_sg[0].iov_base,
+                                      elem->in_sg[0].iov_len);
+        }
+        virtqueue_push(vq, elem, 0);
+        g_free(elem);
+    }
+}
+
+static void virtio_balloon_poll_free_page_hints(void *opaque)
+{
+    VirtIOBalloon *dev = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VirtQueue *vq = dev->free_page_vq;
+
+    while (dev->free_page_report_status != FREE_PAGE_REPORT_S_STOP) {
+        qemu_mutex_lock(&dev->free_page_lock);
+        get_free_page_hints(dev);
+        qemu_mutex_unlock(&dev->free_page_lock);
+    }
+    virtio_notify(vdev, vq);
+}
+
+static bool virtio_balloon_free_page_support(void *opaque)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT);
+}
+
+static bool virtio_balloon_page_poison_support(void *opaque)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON);
+}
+
+static void virtio_balloon_free_page_start(void *opaque)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    /* For the stop and copy phase, we don't need to start the optimization */
+    if (!vdev->vm_running) {
+        return;
+    }
+
+    if (s->free_page_report_cmd_id == UINT_MAX) {
+        s->free_page_report_cmd_id =
+                       VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
+    } else {
+        s->free_page_report_cmd_id++;
+    }
+
+    s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
+    virtio_notify_config(vdev);
+    qemu_bh_schedule(s->free_page_bh);
+}
+
+static void virtio_balloon_free_page_stop(void *opaque)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    if
(s->free_page_report_status != FREE_PAGE_REPORT_S_STOP) { + /* + * The lock also guarantees us that the + * virtio_balloon_poll_free_page_hints exits after the + * free_page_report_status is set to S_STOP. + */ + qemu_mutex_lock(&s->free_page_lock); + /* + * The guest hasn't done the reporting, so host sends a notification + * to the guest to actively stop the reporting. + */ + s->free_page_report_status = FREE_PAGE_REPORT_S_STOP; + qemu_mutex_unlock(&s->free_page_lock); + virtio_notify_config(vdev); + } +} + +static void virtio_balloon_free_page_report_notify(Notifier *notifier, + void *data) +{ + VirtIOBalloon *dev = container_of(notifier, VirtIOBalloon, + free_page_report_notify); + RAMState *rs = data; + + if (!virtio_balloon_free_page_support(dev) || migrate_postcopy()) { + return; + } + + switch (rs->ram_save_state) { + case RAM_SAVE_RESET: + rs->ram_bulk_stage = false; + break; + case RAM_SAVE_ERR: + case RAM_SAVE_BEFORE_SYNC_BITMAP: + virtio_balloon_free_page_stop(dev); + break; + case RAM_SAVE_AFTER_SYNC_BITMAP: + virtio_balloon_free_page_start(dev); + break; + } +} + static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data) { VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); @@ -317,6 +478,17 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data) config.num_pages = cpu_to_le32(dev->num_pages); config.actual = cpu_to_le32(dev->actual); + if (virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON)) { + config.poison_val = cpu_to_le32(dev->poison_val); + } + + if (dev->free_page_report_status == FREE_PAGE_REPORT_S_STOP) { + config.free_page_report_cmd_id = + cpu_to_le32(VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID); + } else { + config.free_page_report_cmd_id = + cpu_to_le32(dev->free_page_report_cmd_id); + } trace_virtio_balloon_get_config(config.num_pages, config.actual); memcpy(config_data, &config, sizeof(struct virtio_balloon_config)); @@ -370,6 +542,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT), &error_abort); } + dev->poison_val = le32_to_cpu(config.poison_val); trace_virtio_balloon_set_config(dev->actual, oldactual); } @@ -379,6 +552,12 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f, VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); f |= dev->host_features; virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ); + + if (virtio_has_feature(dev->host_features, + VIRTIO_BALLOON_F_FREE_PAGE_HINT)) { + virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON); + } + return f; } @@ -415,6 +594,28 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id) return 0; } +static const VMStateDescription vmstate_virtio_balloon_free_page_report = { + .name = "virtio-balloon-device/free-page-report", + .version_id = 1, + .minimum_version_id = 1, + .needed = virtio_balloon_free_page_support, + .fields = (VMStateField[]) { + VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon), + VMSTATE_END_OF_LIST() + } +}; + +static const VMStateDescription vmstate_virtio_balloon_page_poison = { + .name = "virtio-balloon-device/page-poison", + .version_id = 1, + .minimum_version_id = 1, + .needed = virtio_balloon_page_poison_support, + .fields = (VMStateField[]) { + VMSTATE_UINT32(poison_val, VirtIOBalloon), + VMSTATE_END_OF_LIST() + } +}; + static const VMStateDescription vmstate_virtio_balloon_device = { .name = "virtio-balloon-device", .version_id = 1, @@ -425,6 +626,11 @@ static const VMStateDescription vmstate_virtio_balloon_device = { VMSTATE_UINT32(actual, VirtIOBalloon), VMSTATE_END_OF_LIST() }, + .subsections = (const VMStateDescription * []) { + &vmstate_virtio_balloon_free_page_report, + &vmstate_virtio_balloon_page_poison, + NULL + } }; static void virtio_balloon_device_realize(DeviceState *dev, Error **errp) @@ -449,6 +655,28 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp) s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output); s->svq = 
virtio_add_queue(vdev, 128, virtio_balloon_receive_stats); + if (virtio_has_feature(s->host_features, + VIRTIO_BALLOON_F_FREE_PAGE_HINT)) { + s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, NULL); + s->free_page_report_status = FREE_PAGE_REPORT_S_STOP; + s->free_page_report_cmd_id = + VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN; + s->free_page_report_notify.notify = + virtio_balloon_free_page_report_notify; + add_ram_save_state_change_notifier(&s->free_page_report_notify); + if (s->iothread) { + object_ref(OBJECT(s->iothread)); + s->free_page_bh = aio_bh_new(iothread_get_aio_context(s->iothread), + virtio_balloon_poll_free_page_hints, s); + qemu_mutex_init(&s->free_page_lock); + qemu_cond_init(&s->free_page_cond); + s->block_iothread = false; + } else { + /* Simply disable this feature if the iothread wasn't created. */ + s->host_features &= ~(1 << VIRTIO_BALLOON_F_FREE_PAGE_HINT); + virtio_error(vdev, "iothread is missing"); + } + } reset_stats(s); } @@ -457,6 +685,10 @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp) VirtIODevice *vdev = VIRTIO_DEVICE(dev); VirtIOBalloon *s = VIRTIO_BALLOON(dev); + if (virtio_balloon_free_page_support(s)) { + qemu_bh_delete(s->free_page_bh); + virtio_balloon_free_page_stop(s); + } balloon_stats_destroy_timer(s); qemu_remove_balloon_handler(s); virtio_cleanup(vdev); @@ -466,6 +698,10 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev) { VirtIOBalloon *s = VIRTIO_BALLOON(vdev); + if (virtio_balloon_free_page_support(s)) { + virtio_balloon_free_page_stop(s); + } + if (s->stats_vq_elem != NULL) { virtqueue_unpop(s->svq, s->stats_vq_elem, 0); g_free(s->stats_vq_elem); @@ -483,6 +719,26 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status) * was stopped */ virtio_balloon_receive_stats(vdev, s->svq); } + + if (virtio_balloon_free_page_support(s)) { + /* + * The VM is woken up and the iothread was blocked, so signal it to + * continue. 
+ */ + if (vdev->vm_running && s->block_iothread) { + qemu_mutex_lock(&s->free_page_lock); + s->block_iothread = false; + qemu_cond_signal(&s->free_page_cond); + qemu_mutex_unlock(&s->free_page_lock); + } + + /* The VM is stopped, block the iothread. */ + if (!vdev->vm_running) { + qemu_mutex_lock(&s->free_page_lock); + s->block_iothread = true; + qemu_mutex_unlock(&s->free_page_lock); + } + } } static void virtio_balloon_instance_init(Object *obj) @@ -511,6 +767,10 @@ static const VMStateDescription vmstate_virtio_balloon = { static Property virtio_balloon_properties[] = { DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features, VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false), + DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features, + VIRTIO_BALLOON_F_FREE_PAGE_HINT, false), + DEFINE_PROP_LINK("iothread", VirtIOBalloon, iothread, TYPE_IOTHREAD, + IOThread *), DEFINE_PROP_END_OF_LIST(), }; diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h index e0df352..e14e545 100644 --- a/include/hw/virtio/virtio-balloon.h +++ b/include/hw/virtio/virtio-balloon.h @@ -17,11 +17,14 @@ #include "standard-headers/linux/virtio_balloon.h" #include "hw/virtio/virtio.h" +#include "sysemu/iothread.h" #define TYPE_VIRTIO_BALLOON "virtio-balloon-device" #define VIRTIO_BALLOON(obj) \ OBJECT_CHECK(VirtIOBalloon, (obj), TYPE_VIRTIO_BALLOON) +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN 0x80000000 + typedef struct virtio_balloon_stat VirtIOBalloonStat; typedef struct virtio_balloon_stat_modern { @@ -30,15 +33,38 @@ typedef struct virtio_balloon_stat_modern { uint64_t val; } VirtIOBalloonStatModern; +enum virtio_balloon_free_page_report_status { + FREE_PAGE_REPORT_S_STOP = 0, + FREE_PAGE_REPORT_S_REQUESTED = 1, + FREE_PAGE_REPORT_S_START = 2, +}; + typedef struct VirtIOBalloon { VirtIODevice parent_obj; - VirtQueue *ivq, *dvq, *svq; + VirtQueue *ivq, *dvq, *svq, *free_page_vq; + uint32_t free_page_report_status; uint32_t num_pages; uint32_t 
actual; + uint32_t free_page_report_cmd_id; + uint32_t poison_val; uint64_t stats[VIRTIO_BALLOON_S_NR]; VirtQueueElement *stats_vq_elem; size_t stats_vq_offset; QEMUTimer *stats_timer; + IOThread *iothread; + QEMUBH *free_page_bh; + /* + * Lock that serializes access by different threads to the free page + * reporting fields (e.g. free_page_report_status). + */ + QemuMutex free_page_lock; + QemuCond free_page_cond; + /* + * Set while the VM is stopped, to block the iothread from reading further + * free page hints. + */ + bool block_iothread; + Notifier free_page_report_notify; int64_t stats_last_update; int64_t stats_poll_interval; uint32_t host_features; diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h index e446805..eb47e6c 100644 --- a/include/standard-headers/linux/virtio_balloon.h +++ b/include/standard-headers/linux/virtio_balloon.h @@ -34,15 +34,22 @@ #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */ #define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory Stats virtqueue */ #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */ +#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */ +#define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */ /* Size of a PFN in the balloon interface. */ #define VIRTIO_BALLOON_PFN_SHIFT 12 +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID 0 struct virtio_balloon_config { /* Number of pages host wants Guest to give up. */ uint32_t num_pages; /* Number of pages we've actually got in balloon. */ uint32_t actual; + /* Free page report command id, readonly by guest */ + uint32_t free_page_report_cmd_id; + /* Stores PAGE_POISON if page poisoning is in use */ + uint32_t poison_val; }; #define VIRTIO_BALLOON_S_SWAP_IN 0 /* Amount of memory swapped in */
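
The command-id handshake implemented by virtio_balloon_free_page_start(), get_free_page_hints() and virtio_balloon_get_config() in the diff above may be easier to follow in isolation. The following is a minimal standalone sketch, not code from the patch. The enum values and the two constants mirror the patch. The names report_state, next_cmd_id(), start_round(), on_guest_cmd_id() and config_cmd_id() are hypothetical and exist only for illustration. Locking, virtqueue handling and config notification are left out, and UINT32_MAX stands in for the UINT_MAX check in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirrored from the patch. */
#define CMD_ID_MIN  0x80000000u  /* VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN */
#define CMD_ID_STOP 0u           /* VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID */

enum report_status {
    REPORT_S_STOP = 0,
    REPORT_S_REQUESTED = 1,
    REPORT_S_START = 2,
};

/* Hypothetical stand-in for the reporting fields of VirtIOBalloon. */
struct report_state {
    enum report_status status;
    uint32_t cmd_id;
};

/* Advance the command id, wrapping from UINT32_MAX back to CMD_ID_MIN. */
static uint32_t next_cmd_id(uint32_t id)
{
    /* Sketch invariant: ids start at CMD_ID_MIN and only grow or wrap. */
    assert(id >= CMD_ID_MIN);
    return id == UINT32_MAX ? CMD_ID_MIN : id + 1;
}

/* Host side: request a new reporting round (cf. free_page_start). */
static void start_round(struct report_state *s)
{
    s->cmd_id = next_cmd_id(s->cmd_id);
    s->status = REPORT_S_REQUESTED;
}

/* Host side: handle a command id received from the guest. */
static void on_guest_cmd_id(struct report_state *s, uint32_t id)
{
    if (s->status == REPORT_S_STOP) {
        return;  /* reporting was stopped by the host; nothing is read */
    }
    if (id == s->cmd_id) {
        s->status = REPORT_S_START;
    } else if (s->status == REPORT_S_START) {
        /*
         * A different id ends the round only once it has started, so a
         * stale stop sign for a previous command is ignored.
         */
        s->status = REPORT_S_STOP;
    }
}

/* Value exposed to the guest in the config space (cf. get_config). */
static uint32_t config_cmd_id(const struct report_state *s)
{
    return s->status == REPORT_S_STOP ? CMD_ID_STOP : s->cmd_id;
}
```

Starting from a state of { REPORT_S_STOP, CMD_ID_MIN }, start_round() moves the status to REQUESTED and bumps the id. A matching id from the guest then moves it to START, and any other id received after that moves it back to STOP, at which point the guest reads the stop id from the config space.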