From patchwork Thu Feb 28 08:09:44 2013
X-Patchwork-Submitter: Wayne Xia
X-Patchwork-Id: 223811
From: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
To: qemu-devel@nongnu.org
Cc: kwolf@redhat.com, aliguori@us.ibm.com, quintela@redhat.com,
    stefanha@gmail.com, pbonzini@redhat.com, Wenchao Xia
Date: Thu, 28 Feb 2013 16:09:44 +0800
Message-Id: <1362038985-19008-4-git-send-email-xiawenc@linux.vnet.ibm.com>
In-Reply-To: <1362038985-19008-1-git-send-email-xiawenc@linux.vnet.ibm.com>
References: <1362038985-19008-1-git-send-email-xiawenc@linux.vnet.ibm.com>
X-Mailer: git-send-email 1.7.1
Subject: [Qemu-devel] [PATCH 3/4] ram: add support for seekable file save

The seekable save path uses a plain layout for RAM: every page is written
at a fixed, computable offset in the file, without compression.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
---
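Note for reviewers, illustration only and not part of the commit: the
"RAMBlock layout" comment in the diff fixes where every page record lands
in the file. The small standalone sketch below just replays that arithmetic
with made-up values; EXAMPLE_PAGE_SIZE, PAGE_HDR_SIZE, example_block_size()
and example_addr_offset() are names invented for this note and do not exist
in the patch.

/* Standalone sketch of the plain-layout offset arithmetic (not QEMU code). */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_PAGE_SIZE 4096   /* stands in for TARGET_PAGE_SIZE */
#define PAGE_HDR_SIZE     8      /* 8 bytes: offset | cont | flag  */

/* Mirrors ram_file_get_block_size(): a block needs
 * 1 + strlen(idstr) + page_number * (8 + TARGET_PAGE_SIZE) bytes. */
static int64_t example_block_size(const char *idstr, int64_t length)
{
    int64_t pages = length / EXAMPLE_PAGE_SIZE;
    return 1 + (int64_t)strlen(idstr) +
           pages * (PAGE_HDR_SIZE + EXAMPLE_PAGE_SIZE);
}

/* Mirrors ram_file_get_addr_offset(): page 0 sits at the start of the
 * block region (its record also carries the idstr); page nr > 0 comes
 * after the idstr bytes plus nr fixed-size records. */
static int64_t example_addr_offset(const char *idstr, int64_t nr)
{
    if (nr == 0) {
        return 0;
    }
    return 1 + (int64_t)strlen(idstr) +
           nr * (PAGE_HDR_SIZE + EXAMPLE_PAGE_SIZE);
}

int main(void)
{
    const char *idstr = "pc.ram";              /* example block name */
    int64_t length = 16 * EXAMPLE_PAGE_SIZE;   /* a 16-page example  */

    printf("block takes %lld bytes in the file\n",
           (long long)example_block_size(idstr, length));
    printf("page 0 record starts at +%lld, page 3 at +%lld\n",
           (long long)example_addr_offset(idstr, 0),
           (long long)example_addr_offset(idstr, 3));
    return 0;
}

With the 6-character idstr and 16 pages above this prints 65671 bytes for
the block and an offset of 12319 for page 3, matching the "Total" formula
in the layout comment.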
 arch_init.c |  242 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 242 insertions(+), 0 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 8daeafa..0c12095 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -657,6 +657,245 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
     return total_sent;
 }
 
+/* ram save for seekable file support */
+/* RAMBlock layout:
+   every RAMBlock: page_number = block->length >> TARGET_PAGE_BITS
+   1st page:
+     8 bytes: offset | cont | flag
+     1 byte: strlen(block->idstr)
+     n bytes: block->idstr
+     TARGET_PAGE_SIZE bytes: page content.
+   other pages:
+     8 bytes: offset | cont | flag
+     TARGET_PAGE_SIZE bytes: page content.
+   Total:
+     8 + 1 + strlen(block->idstr) + TARGET_PAGE_SIZE +
+       (page_number - 1) * (8 + TARGET_PAGE_SIZE) =
+     1 + strlen(block->idstr) + page_number * (8 + TARGET_PAGE_SIZE)
+   RAM_SAVE_FLAG_EOS is not included in it.
+*/
+
+static int64_t ram_file_get_block_size(RAMBlock *block)
+{
+    g_assert((block->length & (TARGET_PAGE_SIZE - 1)) == 0);
+    int64_t page_num = block->length >> TARGET_PAGE_BITS;
+    int64_t block_size = 1 + strlen(block->idstr) +
+                         page_num * (8 + TARGET_PAGE_SIZE);
+    return block_size;
+}
+
+static int64_t ram_file_get_block_total_size(void)
+{
+    int64_t total_size = 0, block_size;
+
+    RAMBlock *block = QTAILQ_FIRST(&ram_list.blocks);
+    while (block != NULL) {
+        block_size = ram_file_get_block_size(block);
+        total_size += block_size;
+        block = QTAILQ_NEXT(block, next);
+    }
+    return total_size;
+}
+
+static int64_t ram_file_get_addr_offset(RAMBlock *block, ram_addr_t offset)
+{
+    int64_t nr = offset >> TARGET_PAGE_BITS;
+    int64_t ret;
+
+    if (nr == 0) {
+        ret = 0;
+    } else {
+        ret = 1 + strlen(block->idstr) + nr * (8 + TARGET_PAGE_SIZE);
+    }
+    return ret;
+}
+
+/*
+ * ram_save_block_seekable: Writes a page of memory to a seekable f. It is
+ * used to write the data to local images, so write the pages together with
+ * their address and do not compress them; let the block layer handle the
+ * compression.
+ *
+ * Returns:  The number of bytes written.
+ *           0 means no dirty pages
+ */
+/* assume block would not change during the live save vmstate */
+static int ram_save_block_seekable(QEMUFile *f, int64_t base, int64_t max)
+{
+    RAMBlock *block = last_seen_block;
+    ram_addr_t offset = last_offset;
+    bool complete_round = false;
+    int bytes_sent = 0;
+    MemoryRegion *mr;
+    static uint64_t block_offset;
+
+    if (!block) {
+        block = QTAILQ_FIRST(&ram_list.blocks);
+        block_offset = 0;
+    }
+
+    while (true) {
+        mr = block->mr;
+        offset = migration_bitmap_find_and_reset_dirty(mr, offset);
+        if (complete_round && block == last_seen_block &&
+            offset >= last_offset) {
+            break;
+        }
+        if (offset >= block->length) {
+            offset = 0;
+            block_offset += ram_file_get_block_size(block);
+            block = QTAILQ_NEXT(block, next);
+            if (!block) {
+                block = QTAILQ_FIRST(&ram_list.blocks);
+                complete_round = true;
+                block_offset = 0;
+            }
+        } else {
+            uint8_t *p;
+            uint64_t addr_offset, file_offset;
+            int ret;
+
+            int cont = (block == last_sent_block) ?
+                RAM_SAVE_FLAG_CONTINUE : 0;
+
+            p = memory_region_get_ram_ptr(mr) + offset;
+
+            addr_offset = ram_file_get_addr_offset(block, offset);
+            file_offset = base + block_offset + addr_offset;
+            if (file_offset > max) {
+                printf("error!: %" PRIu64 ", max %" PRId64 ".\n",
+                       file_offset, max);
+                return -1;
+            }
+            ret = qemu_fseek(f, file_offset);
+            if (ret < 0) {
+                return ret;
+            }
+
+            /* normal page */
+            bytes_sent = save_block_hdr(f, block, offset, cont,
+                                        RAM_SAVE_FLAG_PAGE);
+            qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+            bytes_sent += TARGET_PAGE_SIZE;
+            acct_info.norm_pages++;
+
+            /* if page is unmodified, continue to the next */
+            if (bytes_sent > 0) {
+                last_sent_block = block;
+                break;
+            }
+        }
+    }
+    last_seen_block = block;
+    last_offset = offset;
+
+    return bytes_sent;
+}
+
+#define RAM_SAVE_FLAG_EOS_SIZE (8)
+static int ram_save_iterate_seekable(QEMUFile *f, int64_t base,
+                                     int64_t max, void *opaque)
+{
+    int ret;
+    int i;
+    int64_t t0;
+    int total_sent = 0;
+
+    qemu_mutex_lock_ramlist();
+
+    if (ram_list.version != last_version) {
+        reset_ram_globals();
+    }
+
+    t0 = qemu_get_clock_ns(rt_clock);
+    i = 0;
+    while ((ret = qemu_file_rate_limit(f)) == 0) {
+        int bytes_sent;
+
+        bytes_sent = ram_save_block_seekable(f, base,
+                                             max - RAM_SAVE_FLAG_EOS_SIZE);
+        /* no more blocks to send */
+        if (bytes_sent == 0) {
+            break;
+        } else if (bytes_sent < 0) {
+            return bytes_sent;
+        }
+        total_sent += bytes_sent;
+        acct_info.iterations++;
+        /* we want to check in the 1st loop, just in case it was the 1st time
+           and we had to sync the dirty bitmap.
+           qemu_get_clock_ns() is a bit expensive, so we only check once
+           every few iterations
+        */
+        if ((i & 63) == 0) {
+            uint64_t t1 = (qemu_get_clock_ns(rt_clock) - t0) / 1000000;
+            if (t1 > MAX_WAIT) {
+                DPRINTF("big wait: %" PRIu64 " milliseconds, %d iterations\n",
+                        t1, i);
+                break;
+            }
+        }
+        i++;
+    }
+
+    qemu_mutex_unlock_ramlist();
+
+    if (ret < 0) {
+        bytes_transferred += total_sent;
+        return ret;
+    }
+
+    bytes_transferred += total_sent;
+
+    return total_sent;
+}
+
+static int64_t ram_save_iterate_seekable_get_size(void *opaque)
+{
+    return ram_file_get_block_total_size() + RAM_SAVE_FLAG_EOS_SIZE;
+}
+
+static int ram_save_complete_seekable(QEMUFile *f, int64_t iterate_base,
+                                      int64_t iterate_max,
+                                      int64_t complete_base, void *opaque)
+{
+    qemu_mutex_lock_ramlist();
+    migration_bitmap_sync();
+
+    /* try transferring iterative blocks of memory */
+
+    /* flush all remaining blocks regardless of rate limiting */
+    while (true) {
+        int bytes_sent;
+
+        bytes_sent = ram_save_block_seekable(f, iterate_base,
+                                             iterate_max -
+                                             RAM_SAVE_FLAG_EOS_SIZE);
+        /* no more blocks to send */
+        if (bytes_sent == 0) {
+            break;
+        } else if (bytes_sent < 0) {
+            return bytes_sent;
+        }
+        bytes_transferred += bytes_sent;
+    }
+    /* put the EOS flag at the end of the RAM section */
+    int ret = qemu_fseek(f, ram_file_get_block_total_size() + iterate_base);
+    if (ret < 0) {
+        return ret;
+    }
+    qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
+    bytes_transferred += 8;
+
+    ret = qemu_fseek(f, complete_base);
+    if (ret < 0) {
+        return ret;
+    }
+
+    migration_end();
+
+    qemu_mutex_unlock_ramlist();
+    qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
+
+    return 0;
+}
+
 static int ram_save_complete(QEMUFile *f, void *opaque)
 {
     qemu_mutex_lock_ramlist();
@@ -880,6 +1119,9 @@ SaveVMHandlers savevm_ram_handlers = {
     .save_live_iterate = ram_save_iterate,
     .save_live_complete = ram_save_complete,
     .save_live_pending = ram_save_pending,
+    .save_live_iterate_seekable = ram_save_iterate_seekable,
+    .save_live_iterate_seekable_get_size = ram_save_iterate_seekable_get_size,
+    .save_live_complete_seekable = ram_save_complete_seekable,
     .load_state = ram_load,
     .cancel = ram_migration_cancel,
 };
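P.S. (illustration only, nothing below is part of the diff): to make the
record format concrete, here is a hypothetical standalone reader for the
first record of one block, assuming the QEMUFile output ends up at raw
offsets in a plain file (it will not for an image format such as qcow2).
EXAMPLE_PAGE_SIZE, be64_to_host() and read_first_record() are names made up
for this note; the layout assumed is the one in the "RAMBlock layout"
comment: an 8-byte big-endian header (page offset OR'ed with flag bits in
the low, page-aligned bits), a 1-byte idstr length, the idstr, then one
page of data.

/* Standalone sketch of reading one plain-layout record (not QEMU code). */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

#define EXAMPLE_PAGE_SIZE 4096   /* stands in for TARGET_PAGE_SIZE */

/* Decode the big-endian 64-bit header written by qemu_put_be64(). */
static uint64_t be64_to_host(const unsigned char b[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | b[i];
    }
    return v;
}

/* Read the record at 'record_offset' and print its header fields.
 * Because page offsets are page aligned, the low bits of the header
 * can only be flag bits (cont, RAM_SAVE_FLAG_PAGE, ...). */
static int read_first_record(FILE *f, long record_offset)
{
    unsigned char hdr[8], len;
    char idstr[256];
    unsigned char page[EXAMPLE_PAGE_SIZE];

    if (fseek(f, record_offset, SEEK_SET) != 0 ||
        fread(hdr, 1, sizeof(hdr), f) != sizeof(hdr) ||
        fread(&len, 1, 1, f) != 1 ||
        fread(idstr, 1, len, f) != len ||
        fread(page, 1, sizeof(page), f) != sizeof(page)) {
        return -1;
    }
    idstr[len] = '\0';

    uint64_t addr_and_flags = be64_to_host(hdr);
    printf("block '%s': page addr 0x%llx, flags 0x%llx\n", idstr,
           (unsigned long long)(addr_and_flags &
                                ~(uint64_t)(EXAMPLE_PAGE_SIZE - 1)),
           (unsigned long long)(addr_and_flags & (EXAMPLE_PAGE_SIZE - 1)));
    return 0;
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <image> <record offset>\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "rb");
    if (!f) {
        perror("fopen");
        return 1;
    }
    int ret = read_first_record(f, atol(argv[2]));
    fclose(f);
    return ret ? 1 : 0;
}

For example, "./reader ram.img 0" would dump the header of the first page
of the first block, with 0 standing in for whatever iterate_base the
seekable save machinery (outside this patch) actually used.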