From patchwork Thu Sep 10 10:24:30 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Liran Schour
X-Patchwork-Id: 33310
From: lirans@il.ibm.com
To: qemu-devel@nongnu.org
Date: Thu, 10 Sep 2009 13:24:30 +0300
Message-Id: <12525782702819-git-send-email-lirans@il.ibm.com>
X-Mailer: git-send-email 1.5.2.4
Subject: [Qemu-devel] [PATCH 2/3] Block live migration
List-Id: qemu-devel.nongnu.org

This patch introduces block migration, which runs as part of live
migration. Blocks are copied to the destination asynchronously: the code
first transfers the whole disk, then transfers all dirty blocks
accumulated during the migration.

The transition from the iterative phase of migration to the end phase
still needs improvement. For now, the transition takes place once every
block has been transferred once; all remaining dirty blocks are
transferred during the end phase, while the guest is suspended.

The transfer rate is improved: block migration now tries to transfer
blocks according to the connection bandwidth.

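As a rough illustration of the two-phase scheme described above, here is
a toy model that is not part of the patch. One byte stands in for one
disk block, and every name in it (ToyDisk, toy_bulk_step, toy_final_pass
and so on) is hypothetical. It only shows why a write that lands behind
the bulk cursor has to be resent in the end phase, while a write ahead of
the cursor is picked up by the bulk pass itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_NUM_BLOCKS 8   /* hypothetical toy disk size, in blocks */

/* Toy model: one byte stands in for one disk block. */
typedef struct ToyDisk {
    uint8_t src[TOY_NUM_BLOCKS];    /* source image, written by the "guest" */
    uint8_t dst[TOY_NUM_BLOCKS];    /* destination image */
    uint8_t dirty[TOY_NUM_BLOCKS];  /* 1 = block must still be (re)sent */
} ToyDisk;

static void toy_init(ToyDisk *d)
{
    int i;

    for (i = 0; i < TOY_NUM_BLOCKS; i++) {
        d->src[i] = (uint8_t)(i + 1);
        d->dst[i] = 0;
        d->dirty[i] = 1;            /* everything starts dirty */
    }
}

/* A guest write changes the source and marks the block dirty again. */
static void toy_guest_write(ToyDisk *d, int blk, uint8_t val)
{
    assert(blk >= 0 && blk < TOY_NUM_BLOCKS);
    d->src[blk] = val;
    d->dirty[blk] = 1;
}

/* Bulk phase: copy the block at *cursor; return 1 once the disk is done. */
static int toy_bulk_step(ToyDisk *d, int *cursor)
{
    if (*cursor < TOY_NUM_BLOCKS) {
        d->dst[*cursor] = d->src[*cursor];
        d->dirty[*cursor] = 0;
        (*cursor)++;
    }
    return *cursor >= TOY_NUM_BLOCKS;
}

/* End phase (guest suspended): resend every block that is still dirty. */
static int toy_final_pass(ToyDisk *d)
{
    int i, resent = 0;

    for (i = 0; i < TOY_NUM_BLOCKS; i++) {
        if (d->dirty[i]) {
            d->dst[i] = d->src[i];
            d->dirty[i] = 0;
            resent++;
        }
    }
    return resent;
}

/* Returns 1 when source and destination hold the same data. */
static int toy_in_sync(const ToyDisk *d)
{
    return memcmp(d->src, d->dst, TOY_NUM_BLOCKS) == 0;
}

/* Run both phases with two guest writes in between; return blocks resent. */
static int toy_migrate(ToyDisk *d)
{
    int cursor = 0;

    toy_init(d);
    while (cursor < 4) {
        toy_bulk_step(d, &cursor);
    }
    toy_guest_write(d, 1, 0xAA);    /* behind the cursor: needs a resend */
    toy_guest_write(d, 6, 0xBB);    /* ahead of the cursor: bulk picks it up */
    while (!toy_bulk_step(d, &cursor)) {
        /* keep copying until the bulk phase reports completion */
    }
    return toy_final_pass(d);
}
```

Calling toy_migrate() on a fresh ToyDisk resends only the block written
behind the cursor, and toy_in_sync() then reports that both images match.
The real code tracks dirtiness per sector and reads asynchronously, which
this sketch leaves out.
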
diff --git a/Makefile b/Makefile
index 035bbbc..ee79f8c 100644
--- a/Makefile
+++ b/Makefile
@@ -126,6 +126,7 @@ OBJS+=buffered_file.o migration.o migration-tcp.o net.o qemu-sockets.o
 OBJS+=qemu-char.o aio.o net-checksum.o savevm.o cache-utils.o
 OBJS+=msmouse.o ps2.o
 OBJS+=qdev.o ssi.o
+OBJS+=block-migration.o
 
 ifdef CONFIG_BRLAPI
 OBJS+= baum.o
diff --git a/block-migration.c b/block-migration.c
new file mode 100644
index 0000000..b29fb52
--- /dev/null
+++ b/block-migration.c
@@ -0,0 +1,633 @@
+/*
+ * QEMU live migration
+ *
+ * Copyright IBM, Corp. 2009
+ *
+ * Authors:
+ *  Liran Schour
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "block_int.h"
+#include "hw/hw.h"
+#include "qemu-timer.h"
+#include "block-migration.h"
+#include
+#include
+
+#define SECTOR_BITS 9
+#define SECTOR_SIZE (1 << SECTOR_BITS)
+#define SECTOR_MASK (~(SECTOR_SIZE - 1))
+
+#define SECTORS_PER_BLOCK 8
+#define BLOCK_SIZE (SECTORS_PER_BLOCK << SECTOR_BITS)
+
+#define BLK_MIG_FLAG_DEVICE_BLOCK       0x01
+#define BLK_MIG_FLAG_EOS                0x02
+
+#define MAX_IS_ALLOCATED_SEARCH 65536
+#define MAX_BLOCKS_READ 10000
+#define BLOCKS_READ_CHANGE 100
+#define INITIAL_BLOCKS_READ 100
+
+//#define DEBUG_BLK_MIGRATION
+
+#ifdef DEBUG_BLK_MIGRATION
+#define dprintf(fmt, ...) \
+    do { printf("blk_migration: " fmt, ## __VA_ARGS__); } while (0)
+#else
+#define dprintf(fmt, ...) \
+    do { } while (0)
+#endif
+
+typedef struct BlkMigBlock {
+    uint8_t buf[BLOCK_SIZE];
+    BlkMigDevState *bmds;
+    int64_t sector;
+    struct iovec iov;
+    QEMUIOVector qiov;
+    BlockDriverAIOCB *aiocb;
+    int ret;
+    struct BlkMigBlock *next;
+} BlkMigBlock;
+
+typedef struct BlkMigState {
+    int bulk_completed;
+    int blk_enable;
+    int shared_base;
+    int no_dirty;
+    QEMUFile *load_file;
+    BlkMigDevState *bmds_first;
+    QEMUTimer *timer;
+} BlkMigState;
+
+static BlkMigState block_mig_state;
+
+static BlkMigBlock *first_blk = NULL;
+static BlkMigBlock *last_blk = NULL;
+
+static int submitted = 0;
+static int read_done = 0;
+static int transferred = 0;
+
+static int64_t print_completion = 0;
+static void mark_clean(BlkMigDevState *bmds, int64_t sector,
+                       int sector_num);
+static int is_dirty(BlkMigDevState *bmds, int64_t sector);
+
+static void blk_mig_read_cb(void *opaque, int ret)
+{
+    BlkMigBlock *blk = opaque;
+
+    blk->ret = ret;
+
+    /* insert at the end */
+    if(last_blk == NULL) {
+        first_blk = last_blk = blk;
+    } else {
+        last_blk->next = blk;
+        last_blk = blk;
+    }
+
+    submitted--;
+    read_done++;
+    assert(submitted >= 0);
+
+    return;
+}
+
+static int mig_read_device_bulk(QEMUFile *f, BlkMigDevState *bms)
+{
+    int nr_sectors;
+    int64_t total_sectors, cur_sector = 0;
+    BlockDriverState *bs = bms->bs;
+    BlkMigBlock *blk;
+
+    blk = qemu_malloc(sizeof(BlkMigBlock));
+
+    cur_sector = bms->cur_sector;
+    total_sectors = bdrv_getlength(bs) >> SECTOR_BITS;
+
+    if(bms->shared_base) {
+        while(cur_sector < bms->total_sectors && !is_dirty(bms, cur_sector)) {
+            cur_sector++;
+        }
+    }
+
+    if(cur_sector >= total_sectors) {
+        bms->cur_sector = total_sectors;
+        qemu_free(blk);
+        return 1;
+    }
+
+    if(cur_sector >= print_completion) {
+        printf("Completed %" PRId64 " %%\r", cur_sector * 100 / total_sectors);
+        fflush(stdout);
+        print_completion += (SECTORS_PER_BLOCK * 10000);
+    }
+
+    /* we transfer BLOCK_SIZE anyway, even if it is not allocated */
+    nr_sectors = SECTORS_PER_BLOCK;
+
+    if(total_sectors - cur_sector < SECTORS_PER_BLOCK) {
+        nr_sectors = (total_sectors - cur_sector);
+    }
+
+    bms->cur_sector = cur_sector + nr_sectors;
+    blk->sector = cur_sector;
+    blk->bmds = bms;
+    blk->next = NULL;
+
+    blk->iov.iov_base = blk->buf;
+    blk->iov.iov_len = nr_sectors * SECTOR_SIZE;
+    qemu_iovec_init_external(&blk->qiov, &blk->iov, 1);
+
+    blk->aiocb = bdrv_aio_readv(bs, cur_sector, &blk->qiov,
+                                nr_sectors, blk_mig_read_cb, blk);
+
+    if(!blk->aiocb) {
+        printf("Error reading sector %" PRId64 "\n", cur_sector);
+        qemu_free(blk);
+        return 0;
+    }
+
+    mark_clean(bms, cur_sector, nr_sectors);
+    submitted++;
+
+    return (bms->cur_sector >= total_sectors);
+}
+
+static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
+{
+    int len, nr_sectors;
+    int64_t total_sectors = bmds->total_sectors, cur_sector = 0;
+    uint8_t * tmp_buf = NULL;
+    BlockDriverState *bs = bmds->bs;
+
+    tmp_buf = qemu_malloc(BLOCK_SIZE);
+
+    cur_sector = bmds->cur_sector;
+
+    if(bmds->shared_base) {
+        while(cur_sector < bmds->total_sectors && !is_dirty(bmds, cur_sector)) {
+            cur_sector++;
+        }
+    }
+
+    if(cur_sector >= total_sectors) {
+        bmds->cur_sector = total_sectors;
+        qemu_free(tmp_buf);
+        return 1;
+    }
+
+    if(cur_sector >= print_completion) {
+        printf("Completed %" PRId64 " %%\r", cur_sector * 100 / total_sectors);
+        fflush(stdout);
+        print_completion += (SECTORS_PER_BLOCK * 10000);
+    }
+
+    /* we transfer BLOCK_SIZE anyway, even if it is not allocated */
+    nr_sectors = SECTORS_PER_BLOCK;
+
+    if(total_sectors - cur_sector < SECTORS_PER_BLOCK) {
+        nr_sectors = (total_sectors - cur_sector);
+    }
+
+    if(bdrv_read(bs, cur_sector, tmp_buf, nr_sectors) < 0) {
+        printf("Error reading sector %" PRId64 "\n", cur_sector);
+    }
+
+    mark_clean(bmds, cur_sector, nr_sectors);
+
+    /* Device name */
+    qemu_put_be64(f,(cur_sector << SECTOR_BITS) | BLK_MIG_FLAG_DEVICE_BLOCK);
+
+    len = strlen(bs->device_name);
+    qemu_put_byte(f, len);
+    qemu_put_buffer(f, (uint8_t *)bs->device_name, len);
+
+    qemu_put_buffer(f, tmp_buf, BLOCK_SIZE);
+
+    bmds->cur_sector = cur_sector + SECTORS_PER_BLOCK;
+
+    qemu_free(tmp_buf);
+
+    return (bmds->cur_sector >= total_sectors);
+}
+
+static void send_blk(QEMUFile *f, BlkMigBlock * blk)
+{
+    int len;
+
+    /* Device name */
+    qemu_put_be64(f,(blk->sector << SECTOR_BITS) | BLK_MIG_FLAG_DEVICE_BLOCK);
+
+    len = strlen(blk->bmds->bs->device_name);
+    qemu_put_byte(f, len);
+    qemu_put_buffer(f, (uint8_t *)blk->bmds->bs->device_name, len);
+
+    qemu_put_buffer(f, blk->buf, BLOCK_SIZE);
+
+    return;
+}
+
+static void blk_mig_save_dev_info(QEMUFile *f, BlkMigDevState *bmds)
+{
+}
+
+static void create_bitmap(BlkMigDevState *bmds)
+{
+    int64_t cur_sector = 0;
+    int nr_sectors, count;
+
+    bmds->bitmap = qemu_malloc(bmds->total_sectors);
+    memset(bmds->bitmap, 1, bmds->total_sectors);
+    bmds->dirty = bmds->total_sectors;
+
+    if(bmds->shared_base) {
+        for(cur_sector = 0; cur_sector < bmds->total_sectors;) {
+            if(cur_sector + MAX_IS_ALLOCATED_SEARCH >= bmds->total_sectors) {
+                count = bmds->total_sectors - cur_sector;
+            } else {
+                count = MAX_IS_ALLOCATED_SEARCH;
+            }
+            if(bdrv_is_allocated(bmds->bs, cur_sector,
+                                 count, &nr_sectors) == 0) {
+                mark_clean(bmds, cur_sector, nr_sectors);
+            }
+
+            cur_sector += nr_sectors;
+        }
+    }
+
+    return;
+}
+
+static void init_blk_migration(QEMUFile *f)
+{
+    BlkMigDevState **pbmds, *bmds;
+    BlockDriverState *bs;
+
+    for (bs = bdrv_first; bs != NULL; bs = bs->next) {
+        if(bs->type == BDRV_TYPE_HD) {
+            bmds = qemu_mallocz(sizeof(BlkMigDevState));
+            bmds->bs = bs;
+            bmds->bulk_completed = 0;
+            bmds->total_sectors = bdrv_getlength(bs) >> SECTOR_BITS;
+            bmds->shared_base = block_mig_state.shared_base;
+            bs->dirty_control = bmds;
+            create_bitmap(bmds);
+
+            if(bmds->bitmap == NULL) {
+                printf("Error allocating bitmap\n");
+            }
+
+            if(bmds->shared_base) {
+                printf("Start migration for %s with shared base image\n", bs->device_name);
+            } else {
+                printf("Start full migration for %s\n", bs->device_name);
+            }
+
+            /* insert at the end */
+            pbmds = &block_mig_state.bmds_first;
+            while (*pbmds != NULL)
+                pbmds = &(*pbmds)->next;
+            *pbmds = bmds;
+
+            blk_mig_save_dev_info(f, bmds);
+
+        }
+    }
+
+    return;
+}
+
+static int blk_mig_save_bulked_block(QEMUFile *f, int is_async)
+{
+    BlkMigDevState *bmds;
+
+    for (bmds = block_mig_state.bmds_first; bmds != NULL; bmds = bmds->next) {
+        if(bmds->bulk_completed == 0) {
+            if(is_async) {
+                if(mig_read_device_bulk(f, bmds) == 1) {
+                    /* completed bulk section for this device */
+                    bmds->bulk_completed = 1;
+                }
+            } else {
+                if(mig_save_device_bulk(f,bmds) == 1) {
+                    /* completed bulk section for this device */
+                    bmds->bulk_completed = 1;
+                }
+            }
+            return 1;
+        }
+    }
+
+    /* reaching here means the bulk phase is completed */
+    block_mig_state.bulk_completed = 1;
+
+    return 0;
+
+}
+
+#define MAX_NUM_BLOCKS 4
+
+static void blk_mig_save_dirty_blocks(QEMUFile *f)
+{
+    BlkMigDevState *bmds;
+    uint8_t buf[BLOCK_SIZE];
+    int64_t sector;
+    int len;
+
+    for(bmds = block_mig_state.bmds_first; bmds != NULL; bmds = bmds->next) {
+        for(sector = 0; sector < bmds->cur_sector;) {
+            if(is_dirty(bmds,sector)) {
+                if(bdrv_read(bmds->bs, sector, buf, SECTORS_PER_BLOCK) < 0) {
+                    printf("error reading sector %" PRId64 " %d\n",
+                           sector, SECTORS_PER_BLOCK);
+                }
+
+                /* device name */
+                qemu_put_be64(f,(sector << SECTOR_BITS) | BLK_MIG_FLAG_DEVICE_BLOCK);
+
+                len = strlen(bmds->bs->device_name);
+
+                qemu_put_byte(f, len);
+                qemu_put_buffer(f, (uint8_t *)bmds->bs->device_name, len);
+
+                qemu_put_buffer(f, buf, BLOCK_SIZE);
+
+                mark_clean(bmds, sector, SECTORS_PER_BLOCK);
+
+                sector += SECTORS_PER_BLOCK;
+            } else {
+                /* sector is clean */
+                sector++;
+            }
+        }
+    }
+
+    return;
+}
+
+static void flush_blks(QEMUFile* f)
+{
+    BlkMigBlock *blk, *tmp;
+
+    dprintf("%s Enter submitted %d read_done %d transferred %d\n", __FUNCTION__,
+            submitted, read_done, transferred);
+
+    for(blk = first_blk; blk != NULL && !qemu_file_rate_limit(f); blk = tmp) {
+        send_blk(f, blk);
+
+        tmp = blk->next;
+        qemu_free(blk);
+
+        read_done--;
+        transferred++;
+        assert(read_done >= 0);
+    }
+    first_blk = blk;
+
+    if(first_blk == NULL) {
+        last_blk = NULL;
+    }
+
+    dprintf("%s Exit submitted %d read_done %d transferred %d\n", __FUNCTION__,
+            submitted, read_done, transferred);
+
+    return;
+}
+
+static int is_dirty(BlkMigDevState *bmds, int64_t sector)
+{
+    return bmds->bitmap[sector];
+}
+int64_t dirty_start, count_d;
+
+static void mark_clean(BlkMigDevState *bmds, int64_t sector, int sector_num)
+{
+    int i;
+
+    for(i = 0; i < sector_num; i++) {
+        if(bmds->bitmap[sector + i] == 1) {
+            bmds->bitmap[sector + i] = 0;
+            bmds->dirty--;
+        }
+    }
+    count_d += sector_num;
+    return;
+}
+
+static void mark_dirty(BlkMigDevState *bmds, int64_t sector, int sector_num)
+{
+    int i;
+
+    for(i = 0; i < sector_num; i++) {
+        if(bmds->bitmap[sector + i] == 0) {
+            bmds->bitmap[sector + i] = 1;
+            bmds->dirty++;
+        }
+    }
+
+    return;
+}
+
+static void mark_dirty_handler(BlockDriverState *bs, int64_t sector, int sector_num)
+{
+    BlkMigDevState *bmds;
+
+    if(bs->type != BDRV_TYPE_HD || bs->device_name[0] == '\0') {
+        return;
+    }
+
+    bmds = bs->dirty_control;
+    if(bmds == NULL) {
+        printf("%s:Error can not find device state\n", __FUNCTION__);
+        return;
+    }
+
+    mark_dirty(bmds, sector, sector_num);
+
+    return;
+}
+
+static void enable_dirty_tracking(void)
+{
+    register_bdrv_dirty_tracking(mark_dirty_handler);
+
+    return;
+}
+
+static void disable_dirty_tracking(void)
+{
+    unregister_bdrv_dirty_tracking();
+
+    return;
+}
+
+static int is_stage2_completed(void)
+{
+    BlkMigDevState *bmds;
+
+    if(submitted > 0) {
+        return 0;
+    }
+
+    for (bmds = block_mig_state.bmds_first; bmds != NULL; bmds = bmds->next) {
+        if(bmds->bulk_completed == 0) {
+            return 0;
+        }
+    }
+
+    return 1;
+}
+
+static int block_save_live(QEMUFile *f, int stage, void *opaque)
+{
+    int ret = 1;
+
+    dprintf("Enter save live stage %d submitted %d transferred %d\n", stage,
+            submitted, transferred);
+
+    if(block_mig_state.blk_enable != 1) {
+        /* no need to migrate storage */
+
+        qemu_put_be64(f,BLK_MIG_FLAG_EOS);
+        return 1;
+    }
+
+    if(stage == 1) {
+        init_blk_migration(f);
+
+        /* start tracking dirty blocks */
+        enable_dirty_tracking();
+    }
+
+    flush_blks(f);
+
+
+    while (submitted + read_done < (qemu_file_get_rate_limit(f) / BLOCK_SIZE)) {
+
+        ret = blk_mig_save_bulked_block(f, 1);
+
+        if (ret == 0) /* no more bulk blocks for now */
+            break;
+    }
+
+    flush_blks(f);
+
+    if(stage == 3) {
+        /* stop tracking dirty blocks */
+        disable_dirty_tracking();
+
+        while(blk_mig_save_bulked_block(f, 0) != 0);
+
+        blk_mig_save_dirty_blocks(f);
+        printf("\nBlock migration completed\n");
+    }
+
+    qemu_put_be64(f,BLK_MIG_FLAG_EOS);
+
+    return ((stage == 2) && is_stage2_completed());
+}
+
+static void blk_mig_write_cb(void *opaque, int ret)
+{
+    BlkMigBlock *blk = opaque;
+
+    qemu_free(blk);
+
+    return;
+}
+
+static int block_load(QEMUFile *f, void *opaque, int version_id)
+{
+    int len, flags;
+    char device_name[256];
+    int64_t addr;
+    BlockDriverState *bs;
+    BlkMigBlock *blk;
+
+    do {
+        blk = qemu_malloc(sizeof(BlkMigBlock));
+        if(blk == NULL) {
+            printf("Error failed to allocate buffer, stop migration\n");
+            break;
+        }
+
+        addr = qemu_get_be64(f);
+
+        flags = addr & ~SECTOR_MASK;
+        addr &= SECTOR_MASK;
+
+        if(flags & BLK_MIG_FLAG_DEVICE_BLOCK) {
+
+            /* get device name */
+            len = qemu_get_byte(f);
+
+            qemu_get_buffer(f, (uint8_t *)device_name, len);
+            device_name[len] = '\0';
+
+            bs = bdrv_find(device_name);
+
+            qemu_get_buffer(f, blk->buf, BLOCK_SIZE);
+            if(bs != NULL) {
+
+                blk->iov.iov_base = blk->buf;
+                blk->iov.iov_len = BLOCK_SIZE;
+                qemu_iovec_init_external(&blk->qiov, &blk->iov, 1);
+
+                blk->aiocb = bdrv_aio_writev(bs, (addr >> SECTOR_BITS), &blk->qiov,
+                                             SECTORS_PER_BLOCK, blk_mig_write_cb, blk);
+            } else {
+                printf("Error unknown block device %s\n", device_name);
+                qemu_free(blk);
+            }
+        } else if(flags & BLK_MIG_FLAG_EOS) {
+            qemu_free(blk);
+        } else {
+            printf("Unknown flags\n");
+            qemu_free(blk);
+        }
+    } while(!(flags & BLK_MIG_FLAG_EOS));
+
+    return 0;
+}
+
+static void block_set_params(int blk_enable, int shared_base, void *opaque)
+{
+    assert(opaque == &block_mig_state);
+
+    block_mig_state.blk_enable = blk_enable;
+    block_mig_state.shared_base = shared_base;
+
+    return;
+}
+
+void blk_mig_info(void)
+{
+    BlockDriverState *bs;
+
+    for (bs = bdrv_first; bs != NULL; bs = bs->next) {
+        printf("Device %s\n", bs->device_name);
+        if(bs->type == BDRV_TYPE_HD) {
+            printf("device %s format %s\n", bs->device_name, bs->drv->format_name);
+        }
+    }
+}
+
+void blk_mig_init(void)
+{
+
+    memset(&block_mig_state, 0, sizeof(BlkMigState));
+
+    register_savevm_live("block", 0, 1, block_set_params, block_save_live,
+                         NULL, block_load, &block_mig_state);
+
+
+}
diff --git a/block-migration.h b/block-migration.h
new file mode 100644
index 0000000..0f23711
--- /dev/null
+++ b/block-migration.h
@@ -0,0 +1,30 @@
+/*
+ * QEMU live migration
+ *
+ * Copyright IBM, Corp. 2009
+ *
+ * Authors:
+ *  Liran Schour
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef BLOCK_MIGRATION_H
+#define BLOCK_MIGRATION_H
+
+typedef struct BlkMigDevState {
+    BlockDriverState *bs;
+    int bulk_completed;
+    int shared_base;
+    struct BlkMigDevState *next;
+    int64_t cur_sector;
+    int64_t total_sectors;
+    int64_t dirty;
+    uint8_t *bitmap;
+} BlkMigDevState;
+
+void blk_mig_init(void);
+void blk_mig_info(void);
+#endif /* BLOCK_MIGRATION_H */
diff --git a/buffered_file.c b/buffered_file.c
index 364b912..1cf94a4 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -211,6 +211,13 @@ out:
     return s->xfer_limit;
 }
 
+static size_t buffered_get_rate_limit(void *opaque)
+{
+    QEMUFileBuffered *s = opaque;
+
+    return s->xfer_limit;
+}
+
 static void buffered_rate_tick(void *opaque)
 {
     QEMUFileBuffered *s = opaque;
@@ -251,7 +258,8 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
 
     s->file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
                              buffered_close, buffered_rate_limit,
-                             buffered_set_rate_limit);
+                             buffered_set_rate_limit,
+                             buffered_get_rate_limit);
 
     s->timer = qemu_new_timer(rt_clock, buffered_rate_tick, s);
 
diff --git a/hw/hw.h b/hw/hw.h
index c835800..11c52d3 100644
--- a/hw/hw.h
+++ b/hw/hw.h
@@ -41,12 +41,14 @@ typedef int (QEMUFileRateLimit)(void *opaque);
  * the old rate otherwise
  */
 typedef size_t (QEMUFileSetRateLimit)(void *opaque, size_t new_rate);
+typedef size_t (QEMUFileGetRateLimit)(void *opaque);
 
 QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer,
                          QEMUFileGetBufferFunc *get_buffer,
                          QEMUFileCloseFunc *close,
                          QEMUFileRateLimit *rate_limit,
-                         QEMUFileSetRateLimit *set_rate_limit);
+                         QEMUFileSetRateLimit *set_rate_limit,
+                         QEMUFileGetRateLimit *get_rate_limit);
 QEMUFile *qemu_fopen(const char *filename, const char *mode);
 QEMUFile *qemu_fopen_socket(int fd);
 QEMUFile *qemu_popen(FILE *popen_file, const char *mode);
@@ -82,6 +84,7 @@ unsigned int qemu_get_be32(QEMUFile *f);
 uint64_t qemu_get_be64(QEMUFile *f);
 int qemu_file_rate_limit(QEMUFile *f);
 size_t qemu_file_set_rate_limit(QEMUFile *f, size_t new_rate);
+size_t qemu_file_get_rate_limit(QEMUFile *f);
 int qemu_file_has_error(QEMUFile *f);
 void qemu_file_set_error(QEMUFile *f);
 
@@ -236,6 +239,7 @@ static inline void qemu_get_sbe64s(QEMUFile *f, int64_t *pv)
 int64_t qemu_ftell(QEMUFile *f);
 int64_t qemu_fseek(QEMUFile *f, int64_t pos, int whence);
 
+typedef void SaveSetParamsHandler(int blk_enable, int shared, void * opaque);
 typedef void SaveStateHandler(QEMUFile *f, void *opaque);
 typedef int SaveLiveStateHandler(QEMUFile *f, int stage, void *opaque);
 typedef int LoadStateHandler(QEMUFile *f, void *opaque, int version_id);
@@ -250,7 +254,8 @@ int register_savevm(const char *idstr,
 int register_savevm_live(const char *idstr,
                          int instance_id,
                          int version_id,
-                         SaveLiveStateHandler *save_live_state,
+                         SaveSetParamsHandler *set_params,
+                         SaveLiveStateHandler *save_live_state,
                          SaveStateHandler *save_state,
                          LoadStateHandler *load_state,
                          void *opaque);
diff --git a/migration-exec.c b/migration-exec.c
index 0dd5aff..4624d55 100644
--- a/migration-exec.c
+++ b/migration-exec.c
@@ -53,8 +53,10 @@ static int exec_close(FdMigrationState *s)
 }
 
 MigrationState *exec_start_outgoing_migration(const char *command,
-                                             int64_t bandwidth_limit,
-                                             int detach)
+                                              int64_t bandwidth_limit,
+                                              int detach,
+                                              int blk,
+                                              int inc)
 {
     FdMigrationState *s;
     FILE *f;
@@ -87,6 +89,9 @@ MigrationState *exec_start_outgoing_migration(const char *command,
     s->mig_state.get_status = migrate_fd_get_status;
     s->mig_state.release = migrate_fd_release;
 
+    s->mig_state.blk = blk;
+    s->mig_state.shared = inc;
+
     s->state = MIG_STATE_ACTIVE;
     s->mon_resume = NULL;
     s->bandwidth_limit = bandwidth_limit;
diff --git a/migration-tcp.c b/migration-tcp.c
index 1f4358e..7dfea98 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -78,7 +78,9 @@ static void tcp_wait_for_connect(void *opaque)
 
 MigrationState *tcp_start_outgoing_migration(const char *host_port,
                                              int64_t bandwidth_limit,
-                                             int detach)
+                                             int detach,
+                                             int blk,
+                                             int inc)
 {
     struct sockaddr_in addr;
     FdMigrationState *s;
@@ -95,7 +97,10 @@ MigrationState *tcp_start_outgoing_migration(const char *host_port,
     s->mig_state.cancel = migrate_fd_cancel;
     s->mig_state.get_status = migrate_fd_get_status;
     s->mig_state.release = migrate_fd_release;
-
+
+    s->mig_state.blk = blk;
+    s->mig_state.shared = inc;
+
     s->state = MIG_STATE_ACTIVE;
     s->mon_resume = NULL;
     s->bandwidth_limit = bandwidth_limit;
diff --git a/migration.c b/migration.c
index 190b37e..4380579 100644
--- a/migration.c
+++ b/migration.c
@@ -48,16 +48,19 @@ void qemu_start_incoming_migration(const char *uri)
         fprintf(stderr, "unknown migration protocol: %s\n", uri);
 }
 
-void do_migrate(Monitor *mon, int detach, const char *uri)
+void do_migrate(Monitor *mon, int detach, const char *blk, const char *inc,
+                const char *uri)
 {
     MigrationState *s = NULL;
     const char *p;
 
     if (strstart(uri, "tcp:", &p))
-        s = tcp_start_outgoing_migration(p, max_throttle, detach);
+        s = tcp_start_outgoing_migration(p, max_throttle, detach,
+                                         (blk != NULL), (inc != NULL));
 #if !defined(WIN32)
     else if (strstart(uri, "exec:", &p))
-        s = exec_start_outgoing_migration(p, max_throttle, detach);
+        s = exec_start_outgoing_migration(p, max_throttle, detach,
+                                          (blk != NULL), (inc != NULL));
 #endif
     else
         monitor_printf(mon, "unknown migration protocol: %s\n", uri);
@@ -239,7 +242,7 @@ void migrate_fd_connect(FdMigrationState *s)
                                       migrate_fd_close);
 
     dprintf("beginning savevm\n");
-    ret = qemu_savevm_state_begin(s->file);
+    ret = qemu_savevm_state_begin(s->file, s->mig_state.blk, s->mig_state.shared);
     if (ret < 0) {
         dprintf("failed, %d\n", ret);
         migrate_fd_error(s);
diff --git a/migration.h b/migration.h
index 37c7f8e..e38a433 100644
--- a/migration.h
+++ b/migration.h
@@ -29,6 +29,8 @@ struct MigrationState
     void (*cancel)(MigrationState *s);
     int (*get_status)(MigrationState *s);
     void (*release)(MigrationState *s);
+    int blk;
+    int shared;
 };
 
 typedef struct FdMigrationState FdMigrationState;
@@ -49,7 +51,8 @@ struct FdMigrationState
 
 void qemu_start_incoming_migration(const char *uri);
 
-void do_migrate(Monitor *mon, int detach, const char *uri);
+void do_migrate(Monitor *mon, int detach, const char *block, const char *inc,
+                const char *uri);
 
 void do_migrate_cancel(Monitor *mon);
 
@@ -64,14 +67,18 @@ void do_info_migrate(Monitor *mon);
 int exec_start_incoming_migration(const char *host_port);
 
 MigrationState *exec_start_outgoing_migration(const char *host_port,
-                                             int64_t bandwidth_limit,
-                                             int detach);
+                                              int64_t bandwidth_limit,
+                                              int detach,
+                                              int blk,
+                                              int inc);
 
 int tcp_start_incoming_migration(const char *host_port);
 
 MigrationState *tcp_start_outgoing_migration(const char *host_port,
                                              int64_t bandwidth_limit,
-                                             int detach);
+                                             int detach,
+                                             int blk,
+                                             int inc);
 
 void migrate_fd_monitor_suspend(FdMigrationState *s);
 
diff --git a/savevm.c b/savevm.c
index 17da35a..ca2f6a6 100644
--- a/savevm.c
+++ b/savevm.c
@@ -160,6 +160,7 @@ struct QEMUFile {
     QEMUFileCloseFunc *close;
     QEMUFileRateLimit *rate_limit;
     QEMUFileSetRateLimit *set_rate_limit;
+    QEMUFileGetRateLimit *get_rate_limit;
     void *opaque;
     int is_write;
 
@@ -247,9 +248,9 @@ QEMUFile *qemu_popen(FILE *popen_file, const char *mode)
     s->popen_file = popen_file;
 
     if(mode[0] == 'r') {
-        s->file = qemu_fopen_ops(s, NULL, popen_get_buffer, popen_close, NULL, NULL);
+        s->file = qemu_fopen_ops(s, NULL, popen_get_buffer, popen_close, NULL, NULL, NULL);
     } else {
-        s->file = qemu_fopen_ops(s, popen_put_buffer, NULL, popen_close, NULL, NULL);
+        s->file = qemu_fopen_ops(s, popen_put_buffer, NULL, popen_close, NULL, NULL, NULL);
     }
     return s->file;
 }
@@ -282,7 +283,7 @@ QEMUFile *qemu_fopen_socket(int fd)
     QEMUFileSocket *s = qemu_mallocz(sizeof(QEMUFileSocket));
 
     s->fd = fd;
-    s->file = qemu_fopen_ops(s, NULL, socket_get_buffer, socket_close, NULL, NULL);
+    s->file = qemu_fopen_ops(s, NULL, socket_get_buffer, socket_close, NULL, NULL, NULL);
     return s->file;
 }
 
@@ -326,9 +327,9 @@ QEMUFile *qemu_fopen(const char *filename, const char *mode)
         goto fail;
 
     if (!strcmp(mode, "wb"))
-        return qemu_fopen_ops(s, file_put_buffer, NULL, file_close, NULL, NULL);
+        return qemu_fopen_ops(s, file_put_buffer, NULL, file_close, NULL, NULL, NULL);
     else if (!strcmp(mode, "rb"))
-        return qemu_fopen_ops(s, NULL, file_get_buffer, file_close, NULL, NULL);
+        return qemu_fopen_ops(s, NULL, file_get_buffer, file_close, NULL, NULL, NULL);
 
 fail:
     if (s->outfile)
@@ -374,16 +375,17 @@ static QEMUFile *qemu_fopen_bdrv(BlockDriverState *bs, int64_t offset, int is_wr
     s->base_offset = offset;
 
     if (is_writable)
-        return qemu_fopen_ops(s, block_put_buffer, NULL, bdrv_fclose, NULL, NULL);
+        return qemu_fopen_ops(s, block_put_buffer, NULL, bdrv_fclose, NULL, NULL, NULL);
 
-    return qemu_fopen_ops(s, NULL, block_get_buffer, bdrv_fclose, NULL, NULL);
+    return qemu_fopen_ops(s, NULL, block_get_buffer, bdrv_fclose, NULL, NULL, NULL);
 }
 
 QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer,
                          QEMUFileGetBufferFunc *get_buffer,
                          QEMUFileCloseFunc *close,
                          QEMUFileRateLimit *rate_limit,
-                         QEMUFileSetRateLimit *set_rate_limit)
+                         QEMUFileSetRateLimit *set_rate_limit,
+                         QEMUFileGetRateLimit *get_rate_limit)
 {
     QEMUFile *f;
 
@@ -395,6 +397,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer,
     f->close = close;
     f->rate_limit = rate_limit;
     f->set_rate_limit = set_rate_limit;
+    f->get_rate_limit = get_rate_limit;
     f->is_write = 0;
 
     return f;
@@ -572,6 +575,14 @@ int qemu_file_rate_limit(QEMUFile *f)
     return 0;
 }
 
+size_t qemu_file_get_rate_limit(QEMUFile *f)
+{
+    if (f->get_rate_limit)
+        return f->get_rate_limit(f->opaque);
+
+    return 0;
+}
+
 size_t qemu_file_set_rate_limit(QEMUFile *f, size_t new_rate)
 {
     if (f->set_rate_limit)
@@ -631,6 +642,7 @@ typedef struct SaveStateEntry {
     int instance_id;
     int version_id;
     int section_id;
+    SaveSetParamsHandler *set_params;
     SaveLiveStateHandler *save_live_state;
     SaveStateHandler *save_state;
     LoadStateHandler *load_state;
@@ -647,7 +659,8 @@ static SaveStateEntry *first_se;
 int register_savevm_live(const char *idstr,
                          int instance_id,
                          int version_id,
-                         SaveLiveStateHandler *save_live_state,
+                         SaveSetParamsHandler *set_params,
+                         SaveLiveStateHandler *save_live_state,
                          SaveStateHandler *save_state,
                          LoadStateHandler *load_state,
                          void *opaque)
@@ -655,11 +668,12 @@ int register_savevm_live(const char *idstr,
     SaveStateEntry *se, **pse;
     static int global_section_id;
 
-    se = qemu_malloc(sizeof(SaveStateEntry));
+    se = qemu_mallocz(sizeof(SaveStateEntry));
     pstrcpy(se->idstr, sizeof(se->idstr), idstr);
     se->instance_id = (instance_id == -1) ? 0 : instance_id;
     se->version_id = version_id;
     se->section_id = global_section_id++;
+    se->set_params = set_params;
     se->save_live_state = save_live_state;
     se->save_state = save_state;
     se->load_state = load_state;
@@ -687,7 +701,8 @@ int register_savevm(const char *idstr,
                     void *opaque)
 {
     return register_savevm_live(idstr, instance_id, version_id,
-                                NULL, save_state, load_state, opaque);
+                                NULL, NULL, save_state, load_state,
+                                opaque);
 }
 
 void unregister_savevm(const char *idstr, void *opaque)
@@ -716,10 +731,17 @@ void unregister_savevm(const char *idstr, void *opaque)
 #define QEMU_VM_SECTION_END          0x03
 #define QEMU_VM_SECTION_FULL         0x04
 
-int qemu_savevm_state_begin(QEMUFile *f)
+int qemu_savevm_state_begin(QEMUFile *f, int blk_enable, int shared)
 {
     SaveStateEntry *se;
 
+    for (se = first_se; se != NULL; se = se->next) {
+        if(se->set_params == NULL) {
+            continue;
+        }
+        se->set_params(blk_enable, shared, se->opaque);
+    }
+
     qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
     qemu_put_be32(f, QEMU_VM_FILE_VERSION);
 
@@ -829,7 +851,7 @@ int qemu_savevm_state(QEMUFile *f)
 
     bdrv_flush_all();
 
-    ret = qemu_savevm_state_begin(f);
+    ret = qemu_savevm_state_begin(f, 0, 0);
     if (ret < 0)
         goto out;
 
diff --git a/sysemu.h b/sysemu.h
index 686228d..0bc49f8 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -65,7 +65,7 @@ void qemu_announce_self(void);
 
 void main_loop_wait(int timeout);
 
-int qemu_savevm_state_begin(QEMUFile *f);
+int qemu_savevm_state_begin(QEMUFile *f, int blk_enable, int shared);
 int qemu_savevm_state_iterate(QEMUFile *f);
 int qemu_savevm_state_complete(QEMUFile *f);
 int qemu_savevm_state(QEMUFile *f);
diff --git a/vl.c b/vl.c
index 7278999..d625059 100644
--- a/vl.c
+++ b/vl.c
@@ -152,6 +152,7 @@ int main(int argc, char **argv)
 #include "qemu-char.h"
 #include "cache-utils.h"
 #include "block.h"
+#include "block-migration.h"
 #include "dma.h"
 #include "audio/audio.h"
 #include "migration.h"
@@ -6059,6 +6060,8 @@ int main(int argc, char **argv, char **envp)
 
     bdrv_init();
 
+    blk_mig_init();
+
     /* we always create the cdrom drive, even if no disk is there */
 
     if (nb_drives_opt < MAX_DRIVES)
@@ -6083,8 +6086,9 @@ int main(int argc, char **argv, char **envp)
         exit(1);
 
     register_savevm("timer", 0, 2, timer_save, timer_load, NULL);
-    register_savevm_live("ram", 0, 3, ram_save_live, NULL, ram_load, NULL);
-
+    register_savevm_live("ram", 0, 3, NULL, ram_save_live, NULL,
+                         ram_load, NULL);
+
 #ifndef _WIN32
     /* must be after terminal init, SDL library changes signal handlers */
     sighandler_setup();
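
For readers following the stream format in block-migration.c: each block
record starts with one big-endian 64-bit word that carries the byte
offset (sector << SECTOR_BITS) in its upper bits and the BLK_MIG_FLAG_*
bits in the low 9 bits. The sketch below shows that packing in isolation.
The helper names blk_hdr_pack, blk_hdr_flags and blk_hdr_sector are
hypothetical and do not exist in the patch, and SECTOR_MASK is written
here as a parenthesized 64-bit expression, which is an assumption about
how one might prefer to define it rather than what the patch does.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BITS 9
#define SECTOR_SIZE (1 << SECTOR_BITS)
/* Assumption: a fully parenthesized, 64-bit mask with no trailing ';'. */
#define SECTOR_MASK (~(int64_t)(SECTOR_SIZE - 1))

#define BLK_MIG_FLAG_DEVICE_BLOCK 0x01
#define BLK_MIG_FLAG_EOS          0x02

/* Hypothetical helper: build the header word for a block record. */
static uint64_t blk_hdr_pack(int64_t sector, int flags)
{
    /* Flags must fit in the low SECTOR_BITS bits left free by the shift. */
    assert(sector >= 0);
    assert((flags & ~(SECTOR_SIZE - 1)) == 0);
    return ((uint64_t)sector << SECTOR_BITS) | (uint64_t)flags;
}

/* Hypothetical helper: extract the flag bits, as block_load() does. */
static int blk_hdr_flags(uint64_t hdr)
{
    return (int)(hdr & (uint64_t)~SECTOR_MASK);
}

/* Hypothetical helper: extract the starting sector number. */
static int64_t blk_hdr_sector(uint64_t hdr)
{
    return (int64_t)((hdr & (uint64_t)SECTOR_MASK) >> SECTOR_BITS);
}
```

A round trip such as blk_hdr_sector(blk_hdr_pack(12345,
BLK_MIG_FLAG_DEVICE_BLOCK)) gives back the sector, and blk_hdr_flags()
gives back the flag. Byte order on the wire is handled separately by
qemu_put_be64() and qemu_get_be64().
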