{"id":807523,"url":"http://patchwork.ozlabs.org/api/1.0/patches/807523/?format=json","project":{"id":14,"url":"http://patchwork.ozlabs.org/api/1.0/projects/14/?format=json","name":"QEMU Development","link_name":"qemu-devel","list_id":"qemu-devel.nongnu.org","list_email":"qemu-devel@nongnu.org","web_url":"","scm_url":"","webscm_url":""},"msgid":"<1504081950-2528-21-git-send-email-peterx@redhat.com>","date":"2017-08-30T08:32:17","name":"[RFC,v2,20/33] migration: new message MIG_RP_MSG_RECV_BITMAP","commit_ref":null,"pull_url":null,"state":"new","archived":false,"hash":"899e524bdd54743eb56b07d72e204adecd777d8d","submitter":{"id":67717,"url":"http://patchwork.ozlabs.org/api/1.0/people/67717/?format=json","name":"Peter Xu","email":"peterx@redhat.com"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/qemu-devel/patch/1504081950-2528-21-git-send-email-peterx@redhat.com/mbox/","series":[{"id":552,"url":"http://patchwork.ozlabs.org/api/1.0/series/552/?format=json","date":"2017-08-30T08:31:59","name":"Migration: postcopy failure recovery","version":2,"mbox":"http://patchwork.ozlabs.org/series/552/mbox/"}],"check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/807523/checks/","tags":{},"headers":{"Return-Path":"<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":"patchwork-incoming@bilbo.ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=nongnu.org\n\t(client-ip=2001:4830:134:3::11; helo=lists.gnu.org;\n\tenvelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org;\n\treceiver=<UNKNOWN>)","ext-mx03.extmail.prod.ext.phx2.redhat.com;\n\tdmarc=none (p=none dis=none) header.from=redhat.com","ext-mx03.extmail.prod.ext.phx2.redhat.com;\n\tspf=fail smtp.mailfrom=peterx@redhat.com"],"Received":["from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11])\n\t(using TLSv1 with cipher AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xhzx54BsQz9t2Q\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed, 30 Aug 2017 19:00:25 +1000 (AEST)","from localhost ([::1]:49155 helo=lists.gnu.org)\n\tby lists.gnu.org with esmtp (Exim 4.71) (envelope-from\n\t<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>)\n\tid 1dmyrH-0005sd-8H\n\tfor incoming@patchwork.ozlabs.org; Wed, 30 Aug 2017 05:00:23 -0400","from eggs.gnu.org ([2001:4830:134:3::10]:34652)\n\tby lists.gnu.org with esmtp (Exim 4.71)\n\t(envelope-from <peterx@redhat.com>) id 1dmySE-000869-1D\n\tfor qemu-devel@nongnu.org; Wed, 30 Aug 2017 04:34:31 -0400","from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)\n\t(envelope-from <peterx@redhat.com>) id 1dmySC-0003mF-9W\n\tfor qemu-devel@nongnu.org; Wed, 30 Aug 2017 04:34:30 -0400","from mx1.redhat.com ([209.132.183.28]:49748)\n\tby eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)\n\t(Exim 4.71) (envelope-from <peterx@redhat.com>) id 1dmySC-0003lx-0J\n\tfor qemu-devel@nongnu.org; Wed, 30 Aug 2017 04:34:28 -0400","from smtp.corp.redhat.com\n\t(int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby mx1.redhat.com (Postfix) with ESMTPS id 01B447E42F;\n\tWed, 30 Aug 2017 08:34:27 +0000 (UTC)","from pxdev.xzpeter.org.com (dhcp-14-103.nay.redhat.com\n\t[10.66.14.103])\n\tby smtp.corp.redhat.com (Postfix) with ESMTP id D30AB84792;\n\tWed, 30 Aug 2017 08:34:13 +0000 (UTC)"],"DMARC-Filter":"OpenDMARC Filter v1.3.2 mx1.redhat.com 01B447E42F","From":"Peter Xu <peterx@redhat.com>","To":"qemu-devel@nongnu.org","Date":"Wed, 30 Aug 2017 16:32:17 +0800","Message-Id":"<1504081950-2528-21-git-send-email-peterx@redhat.com>","In-Reply-To":"<1504081950-2528-1-git-send-email-peterx@redhat.com>","References":"<1504081950-2528-1-git-send-email-peterx@redhat.com>","X-Scanned-By":"MIMEDefang 2.79 on 10.5.11.15","X-Greylist":"Sender IP whitelisted, not delayed by milter-greylist-4.5.16\n\t(mx1.redhat.com [10.5.110.27]);\n\tWed, 30 Aug 2017 08:34:27 +0000 (UTC)","X-detected-operating-system":"by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]\n\t[fuzzy]","X-Received-From":"209.132.183.28","Subject":"[Qemu-devel] [RFC v2 20/33] migration: new message\n\tMIG_RP_MSG_RECV_BITMAP","X-BeenThere":"qemu-devel@nongnu.org","X-Mailman-Version":"2.1.21","Precedence":"list","List-Id":"<qemu-devel.nongnu.org>","List-Unsubscribe":"<https://lists.nongnu.org/mailman/options/qemu-devel>,\n\t<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>","List-Archive":"<http://lists.nongnu.org/archive/html/qemu-devel/>","List-Post":"<mailto:qemu-devel@nongnu.org>","List-Help":"<mailto:qemu-devel-request@nongnu.org?subject=help>","List-Subscribe":"<https://lists.nongnu.org/mailman/listinfo/qemu-devel>,\n\t<mailto:qemu-devel-request@nongnu.org?subject=subscribe>","Cc":"Laurent Vivier <lvivier@redhat.com>,\n\tAndrea Arcangeli <aarcange@redhat.com>, \n\tJuan Quintela <quintela@redhat.com>,\n\tAlexey Perevalov <a.perevalov@samsung.com>, peterx@redhat.com,\n\t\"Dr . David Alan Gilbert\" <dgilbert@redhat.com>","Errors-To":"qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org","Sender":"\"Qemu-devel\"\n\t<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>"},"content":"Introducing new return path message MIG_RP_MSG_RECV_BITMAP to send\nreceived bitmap of ramblock back to source.\n\nThis is the reply message of MIG_CMD_RECV_BITMAP, it contains not only\nthe header (including the ramblock name), and it was appended with the\nwhole ramblock received bitmap on the destination side.\n\nWhen the source receives such a reply message (MIG_RP_MSG_RECV_BITMAP),\nit parses it, convert it to the dirty bitmap by inverting the bits.\n\nOne thing to mention is that, when we send the recv bitmap, we are doing\nthese things in extra:\n\n- converting the bitmap to little endian, to support when hosts are\n  using different endianess on src/dst.\n\n- do proper alignment for 8 bytes, to support when hosts are using\n  different word size (32/64 bits) on src/dst.\n\nSigned-off-by: Peter Xu <peterx@redhat.com>\n---\n migration/migration.c  |  68 ++++++++++++++++++++++++\n migration/migration.h  |   2 +\n migration/ram.c        | 141 +++++++++++++++++++++++++++++++++++++++++++++++++\n migration/ram.h        |   3 ++\n migration/savevm.c     |   2 +-\n migration/trace-events |   2 +\n 6 files changed, 217 insertions(+), 1 deletion(-)","diff":"diff --git a/migration/migration.c b/migration/migration.c\nindex 1370c70..625f19a 100644\n--- a/migration/migration.c\n+++ b/migration/migration.c\n@@ -92,6 +92,7 @@ enum mig_rp_message_type {\n \n     MIG_RP_MSG_REQ_PAGES_ID, /* data (start: be64, len: be32, id: string) */\n     MIG_RP_MSG_REQ_PAGES,    /* data (start: be64, len: be32) */\n+    MIG_RP_MSG_RECV_BITMAP,  /* send recved_bitmap back to source */\n \n     MIG_RP_MSG_MAX\n };\n@@ -449,6 +450,45 @@ void migrate_send_rp_pong(MigrationIncomingState *mis,\n     migrate_send_rp_message(mis, MIG_RP_MSG_PONG, sizeof(buf), &buf);\n }\n \n+void migrate_send_rp_recv_bitmap(MigrationIncomingState *mis,\n+                                 char *block_name)\n+{\n+    char buf[512];\n+    int len;\n+    int64_t res;\n+\n+    /*\n+     * First, we send the header part. It contains only the len of\n+     * idstr, and the idstr itself.\n+     */\n+    len = strlen(block_name);\n+    buf[0] = len;\n+    memcpy(buf + 1, block_name, len);\n+\n+    if (mis->state != MIGRATION_STATUS_POSTCOPY_RECOVER) {\n+        error_report(\"%s: MSG_RP_RECV_BITMAP only used for recovery\",\n+                     __func__);\n+        return;\n+    }\n+\n+    migrate_send_rp_message(mis, MIG_RP_MSG_RECV_BITMAP, len + 1, buf);\n+\n+    /*\n+     * Next, we dump the received bitmap to the stream.\n+     *\n+     * TODO: currently we are safe since we are the only one that is\n+     * using the to_src_file handle (fault thread is still paused),\n+     * and it's ok even not taking the mutex. However the best way is\n+     * to take the lock before sending the message header, and release\n+     * the lock after sending the bitmap.\n+     */\n+    qemu_mutex_lock(&mis->rp_mutex);\n+    res = ramblock_recv_bitmap_send(mis->to_src_file, block_name);\n+    qemu_mutex_unlock(&mis->rp_mutex);\n+\n+    trace_migrate_send_rp_recv_bitmap(block_name, res);\n+}\n+\n MigrationCapabilityStatusList *qmp_query_migrate_capabilities(Error **errp)\n {\n     MigrationCapabilityStatusList *head = NULL;\n@@ -1572,6 +1612,7 @@ static struct rp_cmd_args {\n     [MIG_RP_MSG_PONG]           = { .len =  4, .name = \"PONG\" },\n     [MIG_RP_MSG_REQ_PAGES]      = { .len = 12, .name = \"REQ_PAGES\" },\n     [MIG_RP_MSG_REQ_PAGES_ID]   = { .len = -1, .name = \"REQ_PAGES_ID\" },\n+    [MIG_RP_MSG_RECV_BITMAP]    = { .len = -1, .name = \"RECV_BITMAP\" },\n     [MIG_RP_MSG_MAX]            = { .len = -1, .name = \"MAX\" },\n };\n \n@@ -1616,6 +1657,19 @@ static bool postcopy_pause_return_path_thread(MigrationState *s)\n     return true;\n }\n \n+static int migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name)\n+{\n+    RAMBlock *block = qemu_ram_block_by_name(block_name);\n+\n+    if (!block) {\n+        error_report(\"%s: invalid block name '%s'\", __func__, block_name);\n+        return -EINVAL;\n+    }\n+\n+    /* Fetch the received bitmap and refresh the dirty bitmap */\n+    return ram_dirty_bitmap_reload(s, block);\n+}\n+\n /*\n  * Handles messages sent on the return path towards the source VM\n  *\n@@ -1721,6 +1775,20 @@ retry:\n             migrate_handle_rp_req_pages(ms, (char *)&buf[13], start, len);\n             break;\n \n+        case MIG_RP_MSG_RECV_BITMAP:\n+            if (header_len < 1) {\n+                error_report(\"%s: missing block name\", __func__);\n+                mark_source_rp_bad(ms);\n+                goto out;\n+            }\n+            /* Format: len (1B) + idstr (<255B). This ends the idstr. */\n+            buf[buf[0] + 1] = '\\0';\n+            if (migrate_handle_rp_recv_bitmap(ms, (char *)(buf + 1))) {\n+                mark_source_rp_bad(ms);\n+                goto out;\n+            }\n+            break;\n+\n         default:\n             break;\n         }\ndiff --git a/migration/migration.h b/migration/migration.h\nindex b78b9bd..4051379 100644\n--- a/migration/migration.h\n+++ b/migration/migration.h\n@@ -202,5 +202,7 @@ void migrate_send_rp_pong(MigrationIncomingState *mis,\n                           uint32_t value);\n int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* rbname,\n                               ram_addr_t start, size_t len);\n+void migrate_send_rp_recv_bitmap(MigrationIncomingState *mis,\n+                                 char *block_name);\n \n #endif\ndiff --git a/migration/ram.c b/migration/ram.c\nindex 7e20097..5d938e3 100644\n--- a/migration/ram.c\n+++ b/migration/ram.c\n@@ -182,6 +182,70 @@ void ramblock_recv_bitmap_clear(RAMBlock *rb, void *host_addr)\n     clear_bit(ramblock_recv_bitmap_offset(host_addr, rb), rb->receivedmap);\n }\n \n+#define  RAMBLOCK_RECV_BITMAP_ENDING  (0x0123456789abcdefULL)\n+\n+/*\n+ * Format: bitmap_size (8 bytes) + whole_bitmap (N bytes).\n+ *\n+ * Returns >0 if success with sent bytes, or <0 if error.\n+ */\n+int64_t ramblock_recv_bitmap_send(QEMUFile *file,\n+                                  const char *block_name)\n+{\n+    RAMBlock *block = qemu_ram_block_by_name(block_name);\n+    unsigned long *le_bitmap, nbits;\n+    uint64_t size;\n+\n+    if (!block) {\n+        error_report(\"%s: invalid block name: %s\", __func__, block_name);\n+        return -1;\n+    }\n+\n+    nbits = block->used_length >> TARGET_PAGE_BITS;\n+\n+    /*\n+     * Make sure the tmp bitmap buffer is big enough, e.g., on 32bit\n+     * machines we may need 4 more bytes for padding (see below\n+     * comment). So extend it a bit before hand.\n+     */\n+    le_bitmap = bitmap_new(nbits + BITS_PER_LONG);\n+\n+    /*\n+     * Always use little endian when sending the bitmap. This is\n+     * required that when source and destination VMs are not using the\n+     * same endianess. (Note: big endian won't work.)\n+     */\n+    bitmap_to_le(le_bitmap, block->receivedmap, nbits);\n+\n+    /* Size of the bitmap, in bytes */\n+    size = nbits / 8;\n+\n+    /*\n+     * size is always aligned to 8 bytes for 64bit machines, but it\n+     * may not be true for 32bit machines. We need this padding to\n+     * make sure the migration can survive even between 32bit and\n+     * 64bit machines.\n+     */\n+    size = ROUND_UP(size, 8);\n+\n+    qemu_put_be64(file, size);\n+    qemu_put_buffer(file, (const uint8_t *)le_bitmap, size);\n+    /*\n+     * Mark as an end, in case the middle part is screwed up due to\n+     * some \"misterious\" reason.\n+     */\n+    qemu_put_be64(file, RAMBLOCK_RECV_BITMAP_ENDING);\n+    qemu_fflush(file);\n+\n+    free(le_bitmap);\n+\n+    if (qemu_file_get_error(file)) {\n+        return qemu_file_get_error(file);\n+    }\n+\n+    return size + sizeof(size);\n+}\n+\n /*\n  * An outstanding page request, on the source, having been received\n  * and queued\n@@ -2706,6 +2770,83 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)\n     return ret;\n }\n \n+/*\n+ * Read the received bitmap, revert it as the initial dirty bitmap.\n+ * This is only used when the postcopy migration is paused but wants\n+ * to resume from a middle point.\n+ */\n+int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)\n+{\n+    int ret = -EINVAL;\n+    QEMUFile *file = s->rp_state.from_dst_file;\n+    unsigned long *le_bitmap, nbits = block->used_length >> TARGET_PAGE_BITS;\n+    uint64_t local_size = nbits / 8;\n+    uint64_t size, end_mark;\n+\n+    if (s->state != MIGRATION_STATUS_POSTCOPY_RECOVER) {\n+        error_report(\"%s: incorrect state %s\", __func__,\n+                     MigrationStatus_lookup[s->state]);\n+        return -EINVAL;\n+    }\n+\n+    /*\n+     * Note: see comments in ramblock_recv_bitmap_send() on why we\n+     * need the endianess convertion, and the paddings.\n+     */\n+    local_size = ROUND_UP(local_size, 8);\n+\n+    /* Add addings */\n+    le_bitmap = bitmap_new(nbits + BITS_PER_LONG);\n+\n+    size = qemu_get_be64(file);\n+\n+    /* The size of the bitmap should match with our ramblock */\n+    if (size != local_size) {\n+        error_report(\"%s: ramblock '%s' bitmap size mismatch \"\n+                     \"(0x%lx != 0x%lx)\", __func__, block->idstr,\n+                     size, local_size);\n+        ret = -EINVAL;\n+        goto out;\n+    }\n+\n+    size = qemu_get_buffer(file, (uint8_t *)le_bitmap, local_size);\n+    end_mark = qemu_get_be64(file);\n+\n+    ret = qemu_file_get_error(file);\n+    if (ret || size != local_size) {\n+        error_report(\"%s: read bitmap failed for ramblock '%s': %d\",\n+                     __func__, block->idstr, ret);\n+        ret = -EIO;\n+        goto out;\n+    }\n+\n+    if (end_mark != RAMBLOCK_RECV_BITMAP_ENDING) {\n+        error_report(\"%s: ramblock '%s' end mark incorrect: 0x%\"PRIu64,\n+                     __func__, block->idstr, end_mark);\n+        ret = -EINVAL;\n+        goto out;\n+    }\n+\n+    /*\n+     * Endianess convertion. We are during postcopy (though paused).\n+     * The dirty bitmap won't change. We can directly modify it.\n+     */\n+    bitmap_from_le(block->bmap, le_bitmap, nbits);\n+\n+    /*\n+     * What we received is \"received bitmap\". Revert it as the initial\n+     * dirty bitmap for this ramblock.\n+     */\n+    bitmap_complement(block->bmap, block->bmap, nbits);\n+\n+    trace_ram_dirty_bitmap_reload(block->idstr);\n+\n+    ret = 0;\n+out:\n+    free(le_bitmap);\n+    return ret;\n+}\n+\n static SaveVMHandlers savevm_ram_handlers = {\n     .save_setup = ram_save_setup,\n     .save_live_iterate = ram_save_iterate,\ndiff --git a/migration/ram.h b/migration/ram.h\nindex 4db9922..bd4b8ba 100644\n--- a/migration/ram.h\n+++ b/migration/ram.h\n@@ -57,5 +57,8 @@ int ramblock_recv_bitmap_test(RAMBlock *rb, void *host_addr);\n void ramblock_recv_bitmap_set(RAMBlock *rb, void *host_addr);\n void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr, size_t nr);\n void ramblock_recv_bitmap_clear(RAMBlock *rb, void *host_addr);\n+int64_t ramblock_recv_bitmap_send(QEMUFile *file,\n+                                  const char *block_name);\n+int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);\n \n #endif\ndiff --git a/migration/savevm.c b/migration/savevm.c\nindex f532ca0..7f77a31 100644\n--- a/migration/savevm.c\n+++ b/migration/savevm.c\n@@ -1766,7 +1766,7 @@ static int loadvm_handle_recv_bitmap(MigrationIncomingState *mis,\n         return -EINVAL;\n     }\n \n-    /* TODO: send the bitmap back to source */\n+    migrate_send_rp_recv_bitmap(mis, block_name);\n \n     trace_loadvm_handle_recv_bitmap(block_name);\n \ndiff --git a/migration/trace-events b/migration/trace-events\nindex c5f7e41..9960cd8 100644\n--- a/migration/trace-events\n+++ b/migration/trace-events\n@@ -78,6 +78,7 @@ ram_load_postcopy_loop(uint64_t addr, int flags) \"@%\" PRIx64 \" %x\"\n ram_postcopy_send_discard_bitmap(void) \"\"\n ram_save_page(const char *rbname, uint64_t offset, void *host) \"%s: offset: 0x%\" PRIx64 \" host: %p\"\n ram_save_queue_pages(const char *rbname, size_t start, size_t len) \"%s: start: 0x%zx len: 0x%zx\"\n+ram_dirty_bitmap_reload(char *str) \"%s\"\n \n # migration/migration.c\n await_return_path_close_on_source_close(void) \"\"\n@@ -89,6 +90,7 @@ migrate_fd_cancel(void) \"\"\n migrate_handle_rp_req_pages(const char *rbname, size_t start, size_t len) \"in %s at 0x%zx len 0x%zx\"\n migrate_pending(uint64_t size, uint64_t max, uint64_t post, uint64_t nonpost) \"pending size %\" PRIu64 \" max %\" PRIu64 \" (post=%\" PRIu64 \" nonpost=%\" PRIu64 \")\"\n migrate_send_rp_message(int msg_type, uint16_t len) \"%d: len %d\"\n+migrate_send_rp_recv_bitmap(char *name, int64_t size) \"block '%s' size 0x%\"PRIi64\n migration_completion_file_err(void) \"\"\n migration_completion_postcopy_end(void) \"\"\n migration_completion_postcopy_end_after_complete(void) \"\"\n","prefixes":["RFC","v2","20/33"]}