From patchwork Fri Feb 15 17:46:50 2013
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 220818
From: Paolo Bonzini <pbonzini@redhat.com>
To: qemu-devel@nongnu.org
Cc: owasserm@redhat.com, chegu_vinod@hp.com, quintela@redhat.com
Date: Fri, 15 Feb 2013 18:46:50 +0100
Message-Id: <1360950433-17106-19-git-send-email-pbonzini@redhat.com>
In-Reply-To: <1360950433-17106-1-git-send-email-pbonzini@redhat.com>
References: <1360950433-17106-1-git-send-email-pbonzini@redhat.com>
Subject: [Qemu-devel] [PATCH 18/41] block-migration: add lock

Some state is shared between the block migration code and its AIO
callbacks.  Once block migration runs outside the iothread, the block
migration code and the AIO callbacks will be able to run concurrently.
Protect the critical sections with a separate lock.  Do the same for
completed_sectors, which can be used from the monitor.
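As a rough standalone illustration of the pattern (not part of the
patch; it uses plain pthreads rather than QEMU's QemuMutex, and all
names are invented for the example): a counter pair updated from an
asynchronous completion context and read from another thread, both
guarded by one dedicated mutex, just as blk_mig_read_cb and the
migration code share block_mig_state.lock below.

    /* Hypothetical analogue of the block-migration lock: "submitted"
     * and "read_done" play the role of the fields in BlkMigState. */
    #include <assert.h>
    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t mig_lock = PTHREAD_MUTEX_INITIALIZER;
    static int submitted;   /* protected by mig_lock */
    static int read_done;   /* protected by mig_lock */

    /* Runs in the "AIO callback" thread: move one request from
     * submitted to read_done under the lock, as blk_mig_read_cb
     * does in the patch. */
    static void *completion_thread(void *arg)
    {
        for (int i = 0; i < 1000; i++) {
            pthread_mutex_lock(&mig_lock);
            submitted--;
            read_done++;
            pthread_mutex_unlock(&mig_lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;

        pthread_mutex_lock(&mig_lock);
        submitted = 1000;       /* as mig_save_device_bulk does */
        pthread_mutex_unlock(&mig_lock);

        pthread_create(&tid, NULL, completion_thread, NULL);
        pthread_join(tid, NULL);

        pthread_mutex_lock(&mig_lock);
        printf("submitted=%d read_done=%d\n", submitted, read_done);
        assert(submitted == 0 && read_done == 1000);
        pthread_mutex_unlock(&mig_lock);
        return 0;
    }

The analogue of the ordering rule in the patch ("lock must be taken
_inside_ the iothread lock") is that mig_lock must always be the
innermost lock, and must never be held across a blocking call.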
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 block-migration.c     | 54 ++++++++++++++++++++++++++++++++++++++++++++++--
 include/qemu/atomic.h |  1 +
 2 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index 21842b2..ea99163 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -54,7 +54,7 @@ typedef struct BlkMigDevState {
     int64_t cur_sector;
     int64_t cur_dirty;
 
-    /* Protected by iothread lock.  */
+    /* Protected by block migration lock.  */
     unsigned long *aio_bitmap;
     int64_t completed_sectors;
 } BlkMigDevState;
@@ -69,7 +69,7 @@ typedef struct BlkMigBlock {
     QEMUIOVector qiov;
     BlockDriverAIOCB *aiocb;
 
-    /* Protected by iothread lock.  */
+    /* Protected by block migration lock.  */
     int ret;
     QSIMPLEQ_ENTRY(BlkMigBlock) entry;
 } BlkMigBlock;
@@ -81,7 +81,7 @@ typedef struct BlkMigState {
     QSIMPLEQ_HEAD(bmds_list, BlkMigDevState) bmds_list;
     int64_t total_sector_sum;
 
-    /* Protected by iothread lock.  */
+    /* Protected by lock.  */
     QSIMPLEQ_HEAD(blk_list, BlkMigBlock) blk_list;
     int submitted;
     int read_done;
@@ -90,10 +90,23 @@ typedef struct BlkMigState {
     int transferred;
     int prev_progress;
     int bulk_completed;
+
+    /* Lock must be taken _inside_ the iothread lock.  */
+    QemuMutex lock;
 } BlkMigState;
 
 static BlkMigState block_mig_state;
 
+static void blk_mig_lock(void)
+{
+    qemu_mutex_lock(&block_mig_state.lock);
+}
+
+static void blk_mig_unlock(void)
+{
+    qemu_mutex_unlock(&block_mig_state.lock);
+}
+
 static void blk_send(QEMUFile *f, BlkMigBlock * blk)
 {
     int len;
@@ -120,9 +133,11 @@ uint64_t blk_mig_bytes_transferred(void)
     BlkMigDevState *bmds;
     uint64_t sum = 0;
 
+    blk_mig_lock();
     QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
         sum += bmds->completed_sectors;
     }
+    blk_mig_unlock();
     return sum << BDRV_SECTOR_BITS;
 }
 
@@ -142,6 +157,9 @@ uint64_t blk_mig_bytes_total(void)
     return sum << BDRV_SECTOR_BITS;
 }
 
+
+/* Called with migration lock held.  */
+
 static int bmds_aio_inflight(BlkMigDevState *bmds, int64_t sector)
 {
     int64_t chunk = sector / (int64_t)BDRV_SECTORS_PER_DIRTY_CHUNK;
@@ -154,6 +172,8 @@ static int bmds_aio_inflight(BlkMigDevState *bmds, int64_t sector)
     }
 }
 
+/* Called with migration lock held.  */
+
 static void bmds_set_aio_inflight(BlkMigDevState *bmds, int64_t sector_num,
                                   int nb_sectors, int set)
 {
@@ -188,10 +208,13 @@ static void alloc_aio_bitmap(BlkMigDevState *bmds)
     bmds->aio_bitmap = g_malloc0(bitmap_size);
 }
 
+/* Never hold migration lock when yielding to the main loop!  */
+
 static void blk_mig_read_cb(void *opaque, int ret)
 {
     BlkMigBlock *blk = opaque;
 
+    blk_mig_lock();
     blk->ret = ret;
 
     QSIMPLEQ_INSERT_TAIL(&block_mig_state.blk_list, blk, entry);
@@ -199,6 +222,7 @@ static void blk_mig_read_cb(void *opaque, int ret)
 
     block_mig_state.submitted--;
     block_mig_state.read_done++;
+    blk_mig_unlock();
     assert(block_mig_state.submitted >= 0);
 }
 
@@ -244,7 +268,9 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
     blk->iov.iov_len = nr_sectors * BDRV_SECTOR_SIZE;
     qemu_iovec_init_external(&blk->qiov, &blk->iov, 1);
 
+    blk_mig_lock();
     block_mig_state.submitted++;
+    blk_mig_unlock();
 
     blk->aiocb = bdrv_aio_readv(bs, cur_sector, &blk->qiov,
                                 nr_sectors, blk_mig_read_cb, blk);
@@ -366,8 +392,12 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
     int ret = -EIO;
 
     for (sector = bmds->cur_dirty; sector < bmds->total_sectors;) {
+        blk_mig_lock();
         if (bmds_aio_inflight(bmds, sector)) {
+            blk_mig_unlock();
             bdrv_drain_all();
+        } else {
+            blk_mig_unlock();
         }
         if (bdrv_get_dirty(bmds->bs, sector)) {
@@ -389,8 +419,11 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
                 blk->aiocb = bdrv_aio_readv(bmds->bs, sector, &blk->qiov,
                                             nr_sectors, blk_mig_read_cb,
                                             blk);
+
+                blk_mig_lock();
                 block_mig_state.submitted++;
                 bmds_set_aio_inflight(bmds, sector, nr_sectors, 1);
+                blk_mig_unlock();
             } else {
                 ret = bdrv_read(bmds->bs, sector, blk->buf, nr_sectors);
                 if (ret < 0) {
@@ -446,6 +479,7 @@ static int flush_blks(QEMUFile *f)
             __FUNCTION__, block_mig_state.submitted, block_mig_state.read_done,
             block_mig_state.transferred);
 
+    blk_mig_lock();
     while ((blk = QSIMPLEQ_FIRST(&block_mig_state.blk_list)) != NULL) {
         if (qemu_file_rate_limit(f)) {
             break;
@@ -456,7 +490,9 @@ static int flush_blks(QEMUFile *f)
         }
         QSIMPLEQ_REMOVE_HEAD(&block_mig_state.blk_list, entry);
+        blk_mig_unlock();
         blk_send(f, blk);
+        blk_mig_lock();
 
         g_free(blk->buf);
         g_free(blk);
@@ -465,6 +501,7 @@ static int flush_blks(QEMUFile *f)
         block_mig_state.transferred++;
         assert(block_mig_state.read_done >= 0);
     }
+    blk_mig_unlock();
 
     DPRINTF("%s Exit submitted %d read_done %d transferred %d\n",
             __FUNCTION__, block_mig_state.submitted, block_mig_state.read_done,
@@ -493,6 +530,7 @@ static void blk_mig_cleanup(void)
 
     set_dirty_tracking(0);
 
+    blk_mig_lock();
     while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
         QSIMPLEQ_REMOVE_HEAD(&block_mig_state.bmds_list, entry);
         bdrv_set_in_use(bmds->bs, 0);
@@ -506,6 +544,7 @@ static void blk_mig_cleanup(void)
         g_free(blk->buf);
         g_free(blk);
     }
+    blk_mig_unlock();
 }
 
 static void block_migration_cancel(void *opaque)
@@ -548,9 +587,11 @@ static int block_save_iterate(QEMUFile *f, void *opaque)
     blk_mig_reset_dirty_cursor();
 
     /* control the rate of transfer */
+    blk_mig_lock();
     while ((block_mig_state.submitted +
             block_mig_state.read_done) * BLOCK_SIZE <
            qemu_file_get_rate_limit(f)) {
+        blk_mig_unlock();
         if (block_mig_state.bulk_completed == 0) {
             /* first finish the bulk phase */
             if (blk_mig_save_bulked_block(f) == 0) {
@@ -564,11 +605,13 @@ static int block_save_iterate(QEMUFile *f, void *opaque)
             if (ret < 0) {
                 return ret;
             }
+            blk_mig_lock();
             if (ret != 0) {
                 /* no more dirty blocks */
                 break;
             }
         }
     }
+    blk_mig_unlock();
 
     ret = flush_blks(f);
     if (ret) {
@@ -595,7 +638,9 @@ static int block_save_complete(QEMUFile *f, void *opaque)
 
     /* we know for sure that save bulk is completed and
        all async read completed */
+    blk_mig_lock();
     assert(block_mig_state.submitted == 0);
+    blk_mig_unlock();
 
     do {
         ret = blk_mig_save_dirty_block(f, 0);
@@ -620,6 +665,7 @@ static uint64_t block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size)
     /* Estimate pending number of bytes to send */
     uint64_t pending;
 
+    blk_mig_lock();
     pending = get_remaining_dirty() +
               block_mig_state.submitted * BLOCK_SIZE +
               block_mig_state.read_done * BLOCK_SIZE;
@@ -628,6 +674,7 @@ static uint64_t block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size)
     if (pending == 0 && !block_mig_state.bulk_completed) {
         pending = BLOCK_SIZE;
     }
+    blk_mig_unlock();
 
     DPRINTF("Enter save live pending %" PRIu64 "\n", pending);
     return pending;
@@ -739,6 +786,7 @@ void blk_mig_init(void)
 {
     QSIMPLEQ_INIT(&block_mig_state.bmds_list);
     QSIMPLEQ_INIT(&block_mig_state.blk_list);
+    qemu_mutex_init(&block_mig_state.lock);
 
     register_savevm_live(NULL, "block", 0, 1, &savevm_block_handlers,
                          &block_mig_state);
diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h
index 96a194b..10becb6 100644
--- a/include/qemu/atomic.h
+++ b/include/qemu/atomic.h
@@ -16,6 +16,7 @@
  */
 #define smp_wmb()   barrier()
 #define smp_rmb()   barrier()
+
 /*
  * We use GCC builtin if it's available, as that can use
  * mfence on 32 bit as well, e.g. if built with -march=pentium-m.