From patchwork Tue Oct 1 10:01:20 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wei Yang
X-Patchwork-Id: 1169812
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.intel.com
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46jFGs2w0hz9sPd for ; Tue, 1 Oct 2019 20:04:05 +1000 (AEST)
Received: from localhost ([::1]:39934 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iFF0k-0001RE-2j for incoming@patchwork.ozlabs.org; Tue, 01 Oct 2019 06:04:02 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:41529) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iFEzq-0001Nt-Ol for qemu-devel@nongnu.org; Tue, 01 Oct 2019 06:03:08 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1iFEzo-00061n-Kf for qemu-devel@nongnu.org; Tue, 01 Oct 2019 06:03:06 -0400
Received: from mga12.intel.com ([192.55.52.136]:10068) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1iFEzn-00060e-TP for qemu-devel@nongnu.org; Tue, 01 Oct 2019 06:03:04 -0400
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 Oct 2019 03:03:00 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.64,570,1559545200"; d="scan'208";a="195599665"
Received: from richard.sh.intel.com (HELO localhost) ([10.239.159.54]) by orsmga006.jf.intel.com with ESMTP; 01 Oct 2019 03:02:59 -0700
From: Wei Yang
To: quintela@redhat.com, dgilbert@redhat.com
Subject: [PATCH 1/3] migration/postcopy: rename postcopy_ram_enable_notify to postcopy_ram_incoming_setup
Date: Tue, 1 Oct 2019 18:01:20 +0800
Message-Id: <20191001100122.17730-2-richardw.yang@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191001100122.17730-1-richardw.yang@linux.intel.com>
References: <20191001100122.17730-1-richardw.yang@linux.intel.com>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 192.55.52.136
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: qemu-devel@nongnu.org, Wei Yang
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

Functions postcopy_ram_incoming_setup() and postcopy_ram_incoming_cleanup()
are a pair. Rename postcopy_ram_enable_notify() to make this clear to readers.

Signed-off-by: Wei Yang
Reviewed-by: Dr. David Alan Gilbert
---
 migration/postcopy-ram.c | 4 ++--
 migration/postcopy-ram.h | 2 +-
 migration/savevm.c       | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 1f63e65ed7..b24c4a10c2 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -1094,7 +1094,7 @@ retry:
     return NULL;
 }
 
-int postcopy_ram_enable_notify(MigrationIncomingState *mis)
+int postcopy_ram_incoming_setup(MigrationIncomingState *mis)
 {
     /* Open the fd for the kernel to give us userfaults */
     mis->userfault_fd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
@@ -1321,7 +1321,7 @@ int postcopy_request_shared_page(struct PostCopyFD *pcfd, RAMBlock *rb,
     return -1;
 }
 
-int postcopy_ram_enable_notify(MigrationIncomingState *mis)
+int postcopy_ram_incoming_setup(MigrationIncomingState *mis)
 {
     assert(0);
     return -1;
diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h
index 9c8bd2bae0..d2668cc820 100644
--- a/migration/postcopy-ram.h
+++ b/migration/postcopy-ram.h
@@ -20,7 +20,7 @@ bool postcopy_ram_supported_by_host(MigrationIncomingState *mis);
  * Make all of RAM sensitive to accesses to areas that haven't yet been written
  * and wire up anything necessary to deal with it.
  */
-int postcopy_ram_enable_notify(MigrationIncomingState *mis);
+int postcopy_ram_incoming_setup(MigrationIncomingState *mis);
 
 /*
  * Initialise postcopy-ram, setting the RAM to a state where we can go into
diff --git a/migration/savevm.c b/migration/savevm.c
index adad938f57..f3292eb003 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1865,7 +1865,7 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
      * shouldn't be doing anything yet so don't actually expect requests
      */
     if (migrate_postcopy_ram()) {
-        if (postcopy_ram_enable_notify(mis)) {
+        if (postcopy_ram_incoming_setup(mis)) {
             postcopy_ram_incoming_cleanup(mis);
             return -1;
         }

From patchwork Tue Oct 1 10:01:21 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wei Yang
X-Patchwork-Id: 1169815
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.intel.com
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46jFKn20l4z9sPd for ; Tue, 1 Oct 2019 20:06:37 +1000 (AEST)
Received: from localhost ([::1]:39958 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iFF3C-0003l6-GY for incoming@patchwork.ozlabs.org; Tue, 01 Oct 2019 06:06:34 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:41531) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iFEzr-0001OA-1p for qemu-devel@nongnu.org; Tue, 01 Oct 2019 06:03:07 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1iFEzp-00061z-3x for qemu-devel@nongnu.org; Tue, 01 Oct 2019 06:03:06 -0400
Received: from mga12.intel.com ([192.55.52.136]:10068) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1iFEzo-00060e-Rx for qemu-devel@nongnu.org; Tue, 01 Oct 2019 06:03:05 -0400
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 Oct 2019 03:03:02 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.64,570,1559545200"; d="scan'208";a="195599673"
Received: from richard.sh.intel.com (HELO localhost) ([10.239.159.54]) by orsmga006.jf.intel.com with ESMTP; 01 Oct 2019 03:03:01 -0700
From: Wei Yang
To: quintela@redhat.com, dgilbert@redhat.com
Subject: [PATCH 2/3] migration/postcopy: not necessary to do postcopy_ram_incoming_cleanup when state is ADVISE
Date: Tue, 1 Oct 2019 18:01:21 +0800
Message-Id: <20191001100122.17730-3-richardw.yang@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191001100122.17730-1-richardw.yang@linux.intel.com>
References: <20191001100122.17730-1-richardw.yang@linux.intel.com>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 192.55.52.136
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: qemu-devel@nongnu.org, Wei Yang
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

postcopy_ram_incoming_cleanup() does cleanup for
postcopy_ram_incoming_setup(), while the setup happens only after
migration enters LISTEN state. This means there is nothing to clean up
when migration is still in ADVISE state.

Signed-off-by: Wei Yang
---
 migration/migration.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 5f7e4d15e9..34d5e66f06 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -461,7 +461,6 @@ static void process_incoming_migration_co(void *opaque)
          * but managed to complete within the precopy period, we can use
          * the normal exit.
          */
-        postcopy_ram_incoming_cleanup(mis);
     } else if (ret >= 0) {
         /*
          * Postcopy was started, cleanup should happen at the end of the

From patchwork Tue Oct 1 10:01:22 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wei Yang
X-Patchwork-Id: 1169816
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.intel.com
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46jFNP5phjz9sPJ for ; Tue, 1 Oct 2019 20:08:53 +1000 (AEST)
Received: from localhost ([::1]:39984 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iFF5P-0005wR-DQ for incoming@patchwork.ozlabs.org; Tue, 01 Oct 2019 06:08:51 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:41546) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iFEzs-0001PE-Vc for qemu-devel@nongnu.org; Tue, 01 Oct 2019 06:03:10 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1iFEzq-00062d-Lo for qemu-devel@nongnu.org; Tue, 01 Oct 2019 06:03:08 -0400
Received: from mga12.intel.com
 ([192.55.52.136]:10068) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1iFEzq-00060e-BF for qemu-devel@nongnu.org; Tue, 01 Oct 2019 06:03:06 -0400
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 Oct 2019 03:03:04 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.64,570,1559545200"; d="scan'208";a="195599679"
Received: from richard.sh.intel.com (HELO localhost) ([10.239.159.54]) by orsmga006.jf.intel.com with ESMTP; 01 Oct 2019 03:03:03 -0700
From: Wei Yang
To: quintela@redhat.com, dgilbert@redhat.com
Subject: [PATCH 3/3] migration/postcopy: handle POSTCOPY_INCOMING_RUNNING corner case properly
Date: Tue, 1 Oct 2019 18:01:22 +0800
Message-Id: <20191001100122.17730-4-richardw.yang@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191001100122.17730-1-richardw.yang@linux.intel.com>
References: <20191001100122.17730-1-richardw.yang@linux.intel.com>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 192.55.52.136
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: qemu-devel@nongnu.org, Wei Yang
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

Currently, we set PostcopyState blindly to RUNNING, even if we find that
the previous state is not LISTENING. This leads to a corner case.

First let's look at the code flow:

    qemu_loadvm_state_main()
        ret = loadvm_process_command()
            loadvm_postcopy_handle_run()
                return -1;
        if (ret < 0) {
            if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING)
                ...
        }

From the above snippet, the corner case is that
loadvm_postcopy_handle_run() always sets the state to RUNNING and only
then checks the previous state. If the previous state is not LISTENING,
it returns -1, but at this moment PostcopyState has already been set to
RUNNING.

Then ret is checked in qemu_loadvm_state_main(); when it is -1,
PostcopyState is checked. The current logic pauses postcopy and retries
if PostcopyState is RUNNING. This is not what we expect, because
postcopy is not active yet.

This patch makes sure the state is set to RUNNING only if the previous
state is LISTENING, by introducing an old_state parameter in
postcopy_state_set(). The new state is set only when the current state
equals old_state.

Signed-off-by: Wei Yang
Reviewed-by: Dr. David Alan Gilbert
---
 migration/migration.c    |  2 +-
 migration/postcopy-ram.c | 13 +++++++++----
 migration/postcopy-ram.h |  3 ++-
 migration/savevm.c       | 11 ++++++-----
 4 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 34d5e66f06..369cf3826e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -447,7 +447,7 @@ static void process_incoming_migration_co(void *opaque)
     assert(mis->from_src_file);
     mis->migration_incoming_co = qemu_coroutine_self();
     mis->largest_page_size = qemu_ram_pagesize_largest();
-    postcopy_state_set(POSTCOPY_INCOMING_NONE);
+    postcopy_state_set(POSTCOPY_INCOMING_NONE, NULL);
     migrate_set_state(&mis->state, MIGRATION_STATUS_NONE,
                       MIGRATION_STATUS_ACTIVE);
     ret = qemu_loadvm_state(mis->from_src_file);
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index b24c4a10c2..8f741d636d 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -577,7 +577,7 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
         }
     }
 
-    postcopy_state_set(POSTCOPY_INCOMING_END);
+    postcopy_state_set(POSTCOPY_INCOMING_END, NULL);
 
     if (mis->postcopy_tmp_page) {
         munmap(mis->postcopy_tmp_page, mis->largest_page_size);
@@ -626,7 +626,7 @@ int postcopy_ram_prepare_discard(MigrationIncomingState *mis)
         return -1;
     }
 
-    postcopy_state_set(POSTCOPY_INCOMING_DISCARD);
+    postcopy_state_set(POSTCOPY_INCOMING_DISCARD, NULL);
 
     return 0;
 }
@@ -1457,9 +1457,14 @@ PostcopyState postcopy_state_get(void)
 }
 
 /* Set the state and return the old state */
-PostcopyState postcopy_state_set(PostcopyState new_state)
+PostcopyState postcopy_state_set(PostcopyState new_state,
+                                 const PostcopyState *old_state)
 {
-    return atomic_xchg(&incoming_postcopy_state, new_state);
+    if (!old_state) {
+        return atomic_xchg(&incoming_postcopy_state, new_state);
+    } else {
+        return atomic_cmpxchg(&incoming_postcopy_state, *old_state, new_state);
+    }
 }
 
 /* Register a handler for external shared memory postcopy
diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h
index d2668cc820..e3dde32155 100644
--- a/migration/postcopy-ram.h
+++ b/migration/postcopy-ram.h
@@ -109,7 +109,8 @@ void *postcopy_get_tmp_page(MigrationIncomingState *mis);
 
 PostcopyState postcopy_state_get(void);
 /* Set the state and return the old state */
-PostcopyState postcopy_state_set(PostcopyState new_state);
+PostcopyState postcopy_state_set(PostcopyState new_state,
+                                 const PostcopyState *old_state);
 
 void postcopy_fault_thread_notify(MigrationIncomingState *mis);
 
diff --git a/migration/savevm.c b/migration/savevm.c
index f3292eb003..45474d9c95 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1599,7 +1599,7 @@ enum LoadVMExitCodes {
 static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
                                          uint16_t len)
 {
-    PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_ADVISE);
+    PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_ADVISE, NULL);
     uint64_t remote_pagesize_summary, local_pagesize_summary, remote_tps;
     Error *local_err = NULL;
 
@@ -1628,7 +1628,7 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
     }
 
     if (!postcopy_ram_supported_by_host(mis)) {
-        postcopy_state_set(POSTCOPY_INCOMING_NONE);
+        postcopy_state_set(POSTCOPY_INCOMING_NONE, NULL);
         return -1;
     }
 
@@ -1841,7 +1841,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
 /* After this message we must be able to immediately receive postcopy data */
 static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
 {
-    PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_LISTENING);
+    PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_LISTENING, NULL);
     trace_loadvm_postcopy_handle_listen();
     Error *local_err = NULL;
 
@@ -1929,10 +1929,11 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
 /* After all discards we can start running and asking for pages */
 static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
 {
-    PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_RUNNING);
+    PostcopyState old_ps = POSTCOPY_INCOMING_LISTENING;
+    PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_RUNNING, &old_ps);
 
     trace_loadvm_postcopy_handle_run();
-    if (ps != POSTCOPY_INCOMING_LISTENING) {
+    if (ps != old_ps) {
         error_report("CMD_POSTCOPY_RUN in wrong postcopy state (%d)", ps);
         return -1;
     }
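
For readers who want to try the conditional state update from patch 3/3
outside of QEMU, here is a minimal standalone sketch. It is an
illustration under assumptions, not the QEMU code: it swaps QEMU's
atomic_xchg()/atomic_cmpxchg() macros for C11 <stdatomic.h> calls, and
DemoState, demo_state_get(), demo_state_set() and demo_run_sequence()
are hypothetical names invented for this example. As in the patch, a
NULL old_state means "set unconditionally", and the return value is the
state seen before the call in both cases.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for PostcopyState; not the QEMU enum. */
typedef enum {
    DEMO_NONE,
    DEMO_ADVISE,
    DEMO_LISTENING,
    DEMO_RUNNING,
} DemoState;

static _Atomic int demo_state = DEMO_NONE;

static DemoState demo_state_get(void)
{
    return (DemoState)atomic_load(&demo_state);
}

/*
 * Set the state and return the state seen before the call.
 * With old_state == NULL the store is unconditional; otherwise the
 * store only happens if the current state equals *old_state.
 */
static DemoState demo_state_set(DemoState new_state, const DemoState *old_state)
{
    int expected;

    if (!old_state) {
        return (DemoState)atomic_exchange(&demo_state, (int)new_state);
    }

    expected = (int)*old_state;
    /* On failure, 'expected' is overwritten with the current value. */
    atomic_compare_exchange_strong(&demo_state, &expected, (int)new_state);
    return (DemoState)expected;
}

/* Walk through the corner case described in the commit message. */
static void demo_run_sequence(void)
{
    const DemoState want = DEMO_LISTENING;
    DemoState seen;

    /* RUN arrives while still in ADVISE: the state must stay ADVISE. */
    demo_state_set(DEMO_ADVISE, NULL);
    seen = demo_state_set(DEMO_RUNNING, &want);
    assert(seen == DEMO_ADVISE && demo_state_get() == DEMO_ADVISE);

    /* RUN arrives after LISTEN: the transition goes through. */
    demo_state_set(DEMO_LISTENING, NULL);
    seen = demo_state_set(DEMO_RUNNING, &want);
    assert(seen == DEMO_LISTENING && demo_state_get() == DEMO_RUNNING);
}
```

Calling demo_run_sequence() from a test harness exercises both paths.
The point of the compare-and-swap form is that the check and the store
happen as one atomic step, so a failed check can no longer leave the
state at RUNNING for the error path to misread.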