From patchwork Wed Nov 10 08:37:35 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lei Rao
X-Patchwork-Id: 1553308
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org
 (client-ip=209.51.188.17; helo=lists.gnu.org;
 envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org;
 receiver=)
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by bilbo.ozlabs.org (Postfix) with ESMTPS id 4Hpz3N6Jclz9s5P
 for ; Wed, 10 Nov 2021 19:46:24 +1100 (AEDT)
Received: from localhost ([::1]:51552 helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1mkjFO-0001IW-MV
 for incoming@patchwork.ozlabs.org; Wed, 10 Nov 2021 03:46:22 -0500
Received: from eggs.gnu.org ([209.51.188.92]:37700)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from )
 id 1mkjF5-0001Fe-TQ
 for qemu-devel@nongnu.org; Wed, 10 Nov 2021 03:46:03 -0500
Received: from mga18.intel.com ([134.134.136.126]:44316)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from )
 id 1mkjF4-0004rP-48
 for qemu-devel@nongnu.org; Wed, 10 Nov 2021 03:46:03 -0500
X-IronPort-AV: E=McAfee;i="6200,9189,10163"; a="219524185"
X-IronPort-AV: E=Sophos;i="5.87,223,1631602800";
   d="scan'208";a="219524185"
Received: from orsmga001.jf.intel.com ([10.7.209.18])
  by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
  10 Nov 2021 00:45:48 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.87,223,1631602800";
   d="scan'208";a="533980311"
Received: from unknown (HELO localhost.localdomain.bj.intel.com)
  ([10.238.156.105])
  by orsmga001.jf.intel.com with ESMTP; 10 Nov 2021 00:45:46 -0800
From: "Rao, Lei"
To: chen.zhang@intel.com, zhang.zhanghailiang@huawei.com,
 quintela@redhat.com, dgilbert@redhat.com
Subject: [PATCH 1/2] Fixed a QEMU hang when guest poweroff in COLO mode
Date: Wed, 10 Nov 2021 16:37:35 +0800
Message-Id: <1636533456-5374-1-git-send-email-lei.rao@intel.com>
X-Mailer: git-send-email 1.8.3.1
Received-SPF: pass client-ip=134.134.136.126;
 envelope-from=lei.rao@intel.com; helo=mga18.intel.com
X-Spam_score_int: -41
X-Spam_score: -4.2
X-Spam_bar: ----
X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3,
 SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: "Rao, Lei" , qemu-devel@nongnu.org
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

From: "Rao, Lei"

When the PVM guest powers off, the COLO thread may be waiting on a
semaphore in colo_process_checkpoint(). So, we should wake up the COLO
thread before migration shutdown.

Signed-off-by: Lei Rao
---
 include/migration/colo.h |  1 +
 migration/colo.c         | 14 ++++++++++++++
 migration/migration.c    | 10 ++++++++++
 3 files changed, 25 insertions(+)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 768e1f0..525b45a 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -37,4 +37,5 @@ COLOMode get_colo_mode(void);
 void colo_do_failover(void);
 
 void colo_checkpoint_notify(void *opaque);
+void colo_shutdown(COLOMode mode);
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index 2415325..385c1d7 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -820,6 +820,20 @@ static void colo_wait_handle_message(MigrationIncomingState *mis,
     }
 }
 
+void colo_shutdown(COLOMode mode)
+{
+    if (mode == COLO_MODE_PRIMARY) {
+        MigrationState *s = migrate_get_current();
+
+        qemu_event_set(&s->colo_checkpoint_event);
+        qemu_sem_post(&s->colo_exit_sem);
+    } else {
+        MigrationIncomingState *mis = migration_incoming_get_current();
+
+        qemu_sem_post(&mis->colo_incoming_sem);
+    }
+}
+
 void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
diff --git a/migration/migration.c b/migration/migration.c
index abaf6f9..9df6328 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -225,6 +225,16 @@ void migration_cancel(const Error *error)
 
 void migration_shutdown(void)
 {
+    COLOMode mode = get_colo_mode();
+
+    /*
+     * When the QEMU main thread exits, the COLO thread
+     * may be waiting on a semaphore. So, we should wake up
+     * the COLO thread before migration shutdown.
+     */
+    if (mode != COLO_MODE_NONE) {
+        colo_shutdown(mode);
+    }
     /*
      * Cancel the current migration - that will (eventually)
      * stop the migration using this structure

From patchwork Wed Nov 10 08:37:36 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lei Rao
X-Patchwork-Id: 1553307
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org
 (client-ip=209.51.188.17; helo=lists.gnu.org;
 envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org;
 receiver=)
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by bilbo.ozlabs.org (Postfix) with ESMTPS id 4Hpz3M4qVNz9s5P
 for ; Wed, 10 Nov 2021 19:46:22 +1100 (AEDT)
Received: from localhost ([::1]:51398 helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1mkjFK-0001Bw-FB
 for incoming@patchwork.ozlabs.org; Wed, 10 Nov 2021 03:46:18 -0500
Received: from eggs.gnu.org ([209.51.188.92]:37664)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from )
 id 1mkjF1-0001Bm-Ow
 for qemu-devel@nongnu.org; Wed, 10 Nov 2021 03:45:59 -0500
Received: from mga06.intel.com ([134.134.136.31]:28905)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from )
 id 1mkjEy-0004ri-Vg
 for qemu-devel@nongnu.org; Wed, 10 Nov 2021 03:45:59 -0500
X-IronPort-AV: E=McAfee;i="6200,9189,10163"; a="293459868"
X-IronPort-AV: E=Sophos;i="5.87,223,1631602800";
   d="scan'208";a="293459868"
Received: from orsmga001.jf.intel.com ([10.7.209.18])
  by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
  10 Nov 2021 00:45:54 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.87,223,1631602800";
   d="scan'208";a="533980321"
Received: from unknown (HELO localhost.localdomain.bj.intel.com)
  ([10.238.156.105])
  by orsmga001.jf.intel.com with ESMTP; 10 Nov 2021 00:45:51 -0800
From: "Rao, Lei"
To: chen.zhang@intel.com, zhang.zhanghailiang@huawei.com,
 quintela@redhat.com, dgilbert@redhat.com
Subject: [PATCH 2/2] migration/ram.c: Remove the qemu_mutex_lock in
 colo_flush_ram_cache.
Date: Wed, 10 Nov 2021 16:37:36 +0800
Message-Id: <1636533456-5374-2-git-send-email-lei.rao@intel.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1636533456-5374-1-git-send-email-lei.rao@intel.com>
References: <1636533456-5374-1-git-send-email-lei.rao@intel.com>
Received-SPF: pass client-ip=134.134.136.31;
 envelope-from=lei.rao@intel.com; helo=mga06.intel.com
X-Spam_score_int: -41
X-Spam_score: -4.2
X-Spam_bar: ----
X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3,
 RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: "Rao, Lei" , qemu-devel@nongnu.org
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

From: "Rao, Lei"

The code to acquire bitmap_mutex was added in commit
63268c4970a5f126cc9af75f3ccb8057abef5ec0. There is no need to acquire
bitmap_mutex in colo_flush_ram_cache(), because colo_flush_ram_cache()
is only called on the COLO secondary VM, which is the destination side.
On the COLO secondary VM, only the COLO thread touches the bitmap of
the RAM cache.

Signed-off-by: Lei Rao
Reviewed-by: Juan Quintela
Reviewed-by: Zhang Chen
---
 migration/ram.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 863035d..2c688f5 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3918,7 +3918,6 @@ void colo_flush_ram_cache(void)
     unsigned long offset = 0;
 
     memory_global_dirty_log_sync();
-    qemu_mutex_lock(&ram_state->bitmap_mutex);
     WITH_RCU_READ_LOCK_GUARD() {
         RAMBLOCK_FOREACH_NOT_IGNORED(block) {
             ramblock_sync_dirty_bitmap(ram_state, block);
@@ -3954,7 +3953,6 @@ void colo_flush_ram_cache(void)
         }
     }
     trace_colo_flush_ram_cache_end();
-    qemu_mutex_unlock(&ram_state->bitmap_mutex);
 }
 
 /**
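
Note on patch 1/2: the hang it addresses is the general case of a
worker thread sleeping on a semaphore while the main thread shuts down
and waits for it. The following is a minimal standalone sketch of the
"post before join" idea, assuming plain POSIX threads and unnamed
semaphores on Linux. It is not QEMU code: Worker, worker_start(),
worker_kick() and worker_shutdown() are hypothetical names used only
for illustration, and sem_post() here merely stands in for the role
that qemu_sem_post() and qemu_event_set() play in colo_shutdown().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the COLO thread state; not QEMU's types. */
typedef struct {
    pthread_t thread;
    sem_t wake_sem;      /* the worker sleeps here between rounds */
    atomic_bool quit;    /* set by the main thread at shutdown */
    atomic_int rounds;   /* wake-ups that led to real work */
} Worker;

static void *worker_main(void *opaque)
{
    Worker *w = opaque;

    for (;;) {
        /* Block until someone posts; retry if a signal interrupts us. */
        while (sem_wait(&w->wake_sem) != 0 && errno == EINTR) {
        }
        if (atomic_load(&w->quit)) {
            break;       /* woken only so that we can exit */
        }
        atomic_fetch_add(&w->rounds, 1);
    }
    return NULL;
}

void worker_start(Worker *w)
{
    int rc;

    atomic_init(&w->quit, false);
    atomic_init(&w->rounds, 0);
    rc = sem_init(&w->wake_sem, 0, 0);
    assert(rc == 0);
    rc = pthread_create(&w->thread, NULL, worker_main, w);
    assert(rc == 0);
}

/* Ask the worker to run one round. */
void worker_kick(Worker *w)
{
    sem_post(&w->wake_sem);
}

/*
 * Set the quit flag first, then post, so that a worker sleeping in
 * sem_wait() wakes up and sees the flag. Without the post, the join
 * below could block forever.
 */
void worker_shutdown(Worker *w)
{
    int rc;

    atomic_store(&w->quit, true);
    sem_post(&w->wake_sem);
    rc = pthread_join(w->thread, NULL);
    assert(rc == 0);
    sem_destroy(&w->wake_sem);
}
```

In this sketch, calling worker_start() followed directly by
worker_shutdown() returns promptly even though the worker never
received a kick. Dropping the sem_post() from worker_shutdown() would
leave pthread_join() waiting indefinitely, which is the kind of hang
the commit message describes.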