From patchwork Fri Apr 14 13:17:18 2017
X-Patchwork-Submitter: Alexey Perevalov
X-Patchwork-Id: 750852
From: Alexey Perevalov <a.perevalov@samsung.com>
To: dgilbert@redhat.com, qemu-devel@nongnu.org
Date: Fri, 14 Apr 2017 16:17:18 +0300
Message-id: <1492175840-5021-5-git-send-email-a.perevalov@samsung.com>
X-Mailer: git-send-email 1.9.1
In-reply-to: <1492175840-5021-1-git-send-email-a.perevalov@samsung.com>
References: <1492175840-5021-1-git-send-email-a.perevalov@samsung.com>
Subject: [Qemu-devel] [PATCH 4/6] migration: calculate downtime on dst side
Cc: i.maximets@samsung.com, a.perevalov@samsung.com

This patch provides downtime calculation per vCPU, both as a per-vCPU
summary and as an overlapped value across all vCPUs.

The approach keeps a tree with the page fault address as the key; each
node stores the t1-t2 interval between the page fault time and the page
copy time, together with a bit mask of the affected vCPUs. For more
implementation details please see the comment above the
get_postcopy_total_downtime function.
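For illustration, the overlap rule described above can be tried in
isolation. The sketch below is not part of the patch: the Point struct,
total_overlap_ms() and NCPUS are invented names, and it uses a simple
per-vCPU fault-counter sweep instead of the GTree/bit-mask machinery the
patch adds. Total downtime accumulates only while every vCPU has at least
one outstanding fault.

/* Minimal sketch (assumed names, not QEMU code): sweep sorted fault
 * begin/end points and sum the spans where all vCPUs are blocked. */
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

#define NCPUS 3

typedef struct {
    int64_t tp;     /* point in time, ms */
    int is_end;     /* 0 = page fault (S), 1 = page copied (E) */
    unsigned cpus;  /* bit mask of vCPUs blocked by this fault */
} Point;

static int cmp_point(const void *a, const void *b)
{
    const Point *pa = a, *pb = b;
    return (pa->tp > pb->tp) - (pa->tp < pb->tp);
}

static int64_t total_overlap_ms(Point *p, size_t n)
{
    int blocked[NCPUS] = { 0 };         /* outstanding faults per vCPU */
    int64_t total = 0, prev = 0;
    int all_blocked = 0;
    size_t i;
    int c;

    qsort(p, n, sizeof(*p), cmp_point);
    for (i = 0; i < n; i++) {
        if (all_blocked) {
            /* every vCPU was faulted during [prev, p[i].tp) */
            total += p[i].tp - prev;
        }
        for (c = 0; c < NCPUS; c++) {
            if (p[i].cpus & (1u << c)) {
                blocked[c] += p[i].is_end ? -1 : 1;
            }
        }
        all_blocked = 1;
        for (c = 0; c < NCPUS; c++) {
            all_blocked &= blocked[c] > 0;
        }
        prev = p[i].tp;
    }
    return total;
}

int main(void)
{
    /* Shape of the 3-vCPU example from the comment in
     * get_postcopy_total_downtime(): only the span where all three
     * vCPUs are faulted at once (the "xxx" region) contributes. */
    Point p[] = {
        { 100, 0, 0x1 }, { 130, 1, 0x1 },   /* CPU1: S1..E1 */
        { 120, 0, 0x2 }, { 180, 1, 0x2 },   /* CPU2: S2..E2 */
        { 160, 0, 0x4 }, { 220, 1, 0x4 },   /* CPU3: S3..E3 */
        { 170, 0, 0x1 }, { 240, 1, 0x1 },   /* CPU1 faults again */
    };
    /* prints 10: all vCPUs overlap only during [170, 180) */
    printf("overlapped downtime: %" PRId64 " ms\n",
           total_overlap_ms(p, sizeof(p) / sizeof(p[0])));
    return 0;
}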
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
---
 include/migration/migration.h |  14 +++
 migration/migration.c         | 280 +++++++++++++++++++++++++++++++++++++++++-
 migration/postcopy-ram.c      |  24 +++-
 migration/qemu-file.c         |   1 -
 migration/trace-events        |   9 +-
 5 files changed, 323 insertions(+), 5 deletions(-)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index 5720c88..5d2c628 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -123,10 +123,24 @@ struct MigrationIncomingState {
 
     /* See savevm.c */
     LoadStateEntry_Head loadvm_handlers;
+
+    /*
+     * Tree for keeping postcopy downtime; it is necessary to
+     * calculate the correct downtime across multiple vm suspends.
+     * It keeps the host page address as the key and a
+     * DowntimeDuration as the data.
+     * NULL means the kernel could not provide the process thread id,
+     * so QEMU could not identify which vCPU raised the page fault.
+     */
+    GTree *postcopy_downtime;
 };
 
 MigrationIncomingState *migration_incoming_get_current(void);
 void migration_incoming_state_destroy(void);
+void mark_postcopy_downtime_begin(uint64_t addr, int cpu);
+void mark_postcopy_downtime_end(uint64_t addr);
+uint64_t get_postcopy_total_downtime(void);
+void destroy_downtime_duration(gpointer data);
 
 /*
  * An outstanding page request, on the source, having been received
diff --git a/migration/migration.c b/migration/migration.c
index 79f6425..5bac434 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -38,6 +38,8 @@
 #include "io/channel-tls.h"
 #include "migration/colo.h"
 
+#define DEBUG_VCPU_DOWNTIME 1
+
 #define MAX_THROTTLE  (32 << 20)      /* Migration transfer speed throttling */
 
 /* Amount of time to allocate to each "chunk" of bandwidth-throttled
@@ -77,6 +79,19 @@ static NotifierList migration_state_notifiers =
 
 static bool deferred_incoming;
 
+typedef struct {
+    int64_t begin;
+    int64_t end;
+    uint64_t *cpus; /* cpus bit mask array, QEMU bit functions support
+                       bit operations on memory regions, but don't check
+                       out of range */
+} DowntimeDuration;
+
+typedef struct {
+    int64_t tp; /* point in time */
+    bool is_end;
+    uint64_t *cpus;
+} OverlapDowntime;
+
 /*
  * Current state of incoming postcopy; note this is not part of
  * MigrationIncomingState since it's state is used during cleanup
@@ -117,6 +132,13 @@ MigrationState *migrate_get_current(void)
     return &current_migration;
 }
 
+void destroy_downtime_duration(gpointer data)
+{
+    DowntimeDuration *dd = (DowntimeDuration *)data;
+    g_free(dd->cpus);
+    g_free(data);
+}
+
 MigrationIncomingState *migration_incoming_get_current(void)
 {
     static bool once;
@@ -138,10 +160,13 @@ void migration_incoming_state_destroy(void)
     struct MigrationIncomingState *mis = migration_incoming_get_current();
 
     qemu_event_destroy(&mis->main_thread_load_event);
+    if (mis->postcopy_downtime) {
+        g_tree_destroy(mis->postcopy_downtime);
+        mis->postcopy_downtime = NULL;
+    }
     loadvm_free_handlers(mis);
 }
 
-
 typedef struct {
     bool optional;
     uint32_t size;
@@ -1754,7 +1779,6 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
      */
     ms->postcopy_after_devices = true;
     notifier_list_notify(&migration_state_notifiers, ms);
-
     ms->downtime = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - time_at_stop;
 
     qemu_mutex_unlock_iothread();
@@ -2117,3 +2141,255 @@ PostcopyState postcopy_state_set(PostcopyState new_state)
     return atomic_xchg(&incoming_postcopy_state, new_state);
 }
 
+#define SIZE_TO_KEEP_CPUBITS (1 + smp_cpus/sizeof(guint64))
+
+void mark_postcopy_downtime_begin(uint64_t addr, int cpu)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    DowntimeDuration *dd;
+    if (!mis->postcopy_downtime) {
+        return;
+    }
+
+    dd = g_tree_lookup(mis->postcopy_downtime, (gpointer)addr); /* !!! cast */
+    if (!dd) {
+        dd = (DowntimeDuration *)g_new0(DowntimeDuration, 1);
+        dd->cpus = g_new0(guint64, SIZE_TO_KEEP_CPUBITS);
+        g_tree_insert(mis->postcopy_downtime, (gpointer)addr, (gpointer)dd);
+    }
+
+    if (cpu < 0) {
+        /* assume in this situation all vCPUs are sleeping */
+        int i;
+        for (i = 0; i < SIZE_TO_KEEP_CPUBITS; i++) {
+            dd->cpus[i] = ~(uint64_t)0u;
+        }
+    } else
+        set_bit(cpu, dd->cpus);
+
+    /*
+     * overwrite the previously set dd->begin if that page was already
+     * faulted on another cpu
+     */
+    dd->begin = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    trace_mark_postcopy_downtime_begin(addr, dd, dd->begin, cpu);
+}
+
+void mark_postcopy_downtime_end(uint64_t addr)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    DowntimeDuration *dd;
+    if (!mis->postcopy_downtime) {
+        return;
+    }
+
+    dd = g_tree_lookup(mis->postcopy_downtime, (gpointer)addr);
+    if (!dd) {
+        /* error_report("Could not populate downtime duration completion time \n\
+           There is no downtime duration for 0x%"PRIx64, addr); */
+        return;
+    }
+
+    dd->end = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    trace_mark_postcopy_downtime_end(addr, dd, dd->end);
+}
+
+struct downtime_overlay_cxt {
+    GPtrArray *downtime_points;
+    size_t number_of_points;
+};
+/*
+ * This function splits each DowntimeDuration, which is represented
+ * as a start/end pair, into points, and fills the array with them
+ * so that it can be sorted later.
+ */
+static gboolean split_duration_and_fill_points(gpointer key, gpointer value,
+                                               gpointer data)
+{
+    struct downtime_overlay_cxt *ctx = (struct downtime_overlay_cxt *)data;
+    DowntimeDuration *dd = (DowntimeDuration *)value;
+    GPtrArray *interval = ctx->downtime_points;
+    if (dd->begin) {
+        OverlapDowntime *od_begin = g_new0(OverlapDowntime, 1);
+        od_begin->cpus = g_memdup(dd->cpus, sizeof(uint64_t) * SIZE_TO_KEEP_CPUBITS);
+        od_begin->tp = dd->begin;
+        od_begin->is_end = false;
+        g_ptr_array_add(interval, od_begin);
+        ctx->number_of_points += 1;
+    }
+
+    if (dd->end) {
+        OverlapDowntime *od_end = g_new0(OverlapDowntime, 1);
+        od_end->cpus = g_memdup(dd->cpus, sizeof(uint64_t) * SIZE_TO_KEEP_CPUBITS);
+        od_end->tp = dd->end;
+        od_end->is_end = true;
+        g_ptr_array_add(interval, od_end);
+        ctx->number_of_points += 1;
+    }
+
+    if (dd->end && dd->begin)
+        trace_split_duration_and_fill_points(dd->end - dd->begin, (uint64_t)key);
+    return FALSE;
+}
+
+#ifdef DEBUG_VCPU_DOWNTIME
+static gboolean calculate_per_cpu(gpointer key, gpointer value,
+                                  gpointer data)
+{
+    int *downtime_cpu = (int *)data;
+    DowntimeDuration *dd = (DowntimeDuration *)value;
+    int cpu_iter;
+    for (cpu_iter = 0; cpu_iter < smp_cpus; cpu_iter++) {
+        if (test_bit(cpu_iter, dd->cpus) && dd->end && dd->begin)
+            downtime_cpu[cpu_iter] += dd->end - dd->begin;
+    }
+    return FALSE;
+}
+#endif /* DEBUG_VCPU_DOWNTIME */
+
+static gint compare_downtime(gconstpointer a, gconstpointer b)
+{
+    DowntimeDuration *dda = (DowntimeDuration *)a;
+    DowntimeDuration *ddb = (DowntimeDuration *)b;
+    return dda->begin - ddb->begin;
+}
+
+static void destroy_overlap_downtime(gpointer data)
+{
+    OverlapDowntime *od = (OverlapDowntime *)data;
+    g_free(od->cpus);
+    g_free(data);
+}
+
+static int check_overlap(uint64_t *b)
+{
+    unsigned long zero_bit = find_first_zero_bit(b, BITS_PER_LONG * SIZE_TO_KEEP_CPUBITS);
+    return zero_bit >= smp_cpus;
+}
+
+/*
+ * This function calculates downtime per cpu and traces it
+ *
+ * It also calculates the total downtime as an overlap of the intervals
+ * of all vCPUs.
+ *
+ * The approach is the following:
+ * Initially the intervals are kept in a tree where the key is the
+ * pagefault address and the values are:
+ *  begin - page fault time
+ *  end   - page load time
+ *  cpus  - bit mask of the affected cpus
+ *
+ * To calculate the overlap on all cpus, the intervals are converted
+ * into an array of points in time (downtime_points); the size of the
+ * array is 2 * the number of nodes in the tree of intervals (2 array
+ * elements per interval).
+ * Each element is marked as the end (E) or the start (S) of an interval.
+ * The overlap downtime is calculated for an SE pair, but only when
+ * there is a sequence S(0..N)E(M) that covers every vCPU.
+ *
+ * As an example we have 3 CPUs:
+ *
+ *      S1        E1           S1               E1
+ * -----***********------------xxx***************------------------------> CPU1
+ *
+ *             S2                E2
+ * ------------****************xxx---------------------------------------> CPU2
+ *
+ *                         S3            E3
+ * ------------------------****xxx********-------------------------------> CPU3
+ *
+ * We have the sequence S1,S2,E1,S3,S1,E2,E3,E1
+ * S2,E1 - doesn't match the condition, because the sequence S1,S2,E1 doesn't include CPU3
+ * S3,S1,E2 - the sequence includes all CPUs, in this case the overlap is S1,E2
+ * Legend of the picture:  * - downtime per vCPU
+ *                         x - overlapped downtime
+ */
+uint64_t get_postcopy_total_downtime(void)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    uint64_t total_downtime = 0; /* for total overlapped downtime */
+    const int intervals = g_tree_nnodes(mis->postcopy_downtime);
+    int point_iter, start_point_iter, i;
+    struct downtime_overlay_cxt dp_ctx = { 0 };
+    /*
+     * the array will contain 2 * intervals points or fewer, if page fault
+     * finalization did not happen for some pages; the real count is in
+     * dp_ctx.number_of_points
+     */
+    dp_ctx.downtime_points = g_ptr_array_new_full(2 * intervals,
+                                                  destroy_overlap_downtime);
+    if (!mis->postcopy_downtime) {
+        goto out;
+    }
+
+#ifdef DEBUG_VCPU_DOWNTIME
+    {
+        gint *downtime_cpu = g_new0(int, smp_cpus);
+        g_tree_foreach(mis->postcopy_downtime, calculate_per_cpu, downtime_cpu);
+        for (point_iter = 0; point_iter < smp_cpus; point_iter++)
+        {
+            trace_downtime_per_cpu(point_iter, downtime_cpu[point_iter]);
+        }
+        g_free(downtime_cpu);
+    }
+#endif /* DEBUG_VCPU_DOWNTIME */
+
+    /* make downtime points S/E from intervals */
+    g_tree_foreach(mis->postcopy_downtime, split_duration_and_fill_points,
+                   &dp_ctx);
+    g_ptr_array_sort(dp_ctx.downtime_points, compare_downtime);
+
+    for (point_iter = 1; point_iter < dp_ctx.number_of_points;
+         point_iter++) {
+        OverlapDowntime *od = g_ptr_array_index(dp_ctx.downtime_points,
+                                                point_iter);
+        uint64_t *cur_cpus;
+        int smp_cpus_i = smp_cpus;
+        OverlapDowntime *prev_od = g_ptr_array_index(dp_ctx.downtime_points,
+                                                     point_iter - 1);
+        if (!od || !prev_od)
+            continue;
+        /* we need sequence SE */
+        if (!od->is_end || prev_od->is_end)
+            continue;
+
+        cur_cpus = g_memdup(od->cpus, sizeof(uint64_t) * SIZE_TO_KEEP_CPUBITS);
+        for (start_point_iter = point_iter - 1;
+             start_point_iter >= 0 && smp_cpus_i;
+             start_point_iter--, smp_cpus_i--) {
+            OverlapDowntime *t_od = g_ptr_array_index(dp_ctx.downtime_points,
+                                                      start_point_iter);
+            if (!t_od)
+                break;
+            /* should be S */
+            if (t_od->is_end)
+                break;
+
+            /* the points were sorted; it's possible that the end has not
+             * occurred for a page, but such points were omitted in
+             * split_duration_and_fill_points */
+            if (od->tp <= prev_od->tp) {
+                break;
+            }
+
+            for (i = 0; i < SIZE_TO_KEEP_CPUBITS; i++) {
+                cur_cpus[i] |= t_od->cpus[i];
+            }
+
+            /* check_overlap - just counts the number of bits in cur_cpus
+             * and compares it with smp_cpus */
+            if (check_overlap(cur_cpus)) {
+                total_downtime += od->tp - prev_od->tp;
+                /* the situation when one S point represents all vCPUs is possible */
+                break;
+            }
+        }
+        g_free(cur_cpus);
+    }
+    trace_get_postcopy_total_downtime(g_tree_nnodes(mis->postcopy_downtime),
+                                      total_downtime);
+out:
+    g_ptr_array_free(dp_ctx.downtime_points, TRUE);
+    return total_downtime;
+}
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 70f0480..ea89f4e 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -23,8 +23,10 @@
 #include "migration/postcopy-ram.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/balloon.h"
+#include
 #include "qemu/error-report.h"
 #include "trace.h"
+#include "glib/glib-helper.h"
 
 /* Arbitrary limit on size of each discard command,
  * keeps them around ~200 bytes
@@ -81,6 +83,11 @@ static bool ufd_version_check(int ufd, MigrationIncomingState *mis)
         return false;
     }
 
+    if (mis && UFFD_FEATURE_THREAD_ID & api_struct.features) {
+        mis->postcopy_downtime = g_tree_new_full(g_int_cmp64,
+                                                 NULL, NULL, destroy_downtime_duration);
+    }
+
     if (getpagesize() != ram_pagesize_summary()) {
         bool have_hp = false;
         /* We've got a huge page */
@@ -404,6 +411,18 @@ static int ram_block_enable_notify(const char *block_name, void *host_addr,
     return 0;
 }
 
+static int get_mem_fault_cpu_index(uint32_t pid)
+{
+    CPUState *cpu_iter;
+
+    CPU_FOREACH(cpu_iter) {
+        if (cpu_iter->thread_id == pid)
+            return cpu_iter->cpu_index;
+    }
+    trace_get_mem_fault_cpu_index(pid);
+    return -1;
+}
+
 /*
  * Handle faults detected by the USERFAULT markings
  */
@@ -481,8 +500,10 @@ static void *postcopy_ram_fault_thread(void *opaque)
         rb_offset &= ~(qemu_ram_pagesize(rb) - 1);
         trace_postcopy_ram_fault_thread_request(msg.arg.pagefault.address,
                                                 qemu_ram_get_idstr(rb),
-                                                rb_offset);
+                                                rb_offset, msg.arg.pagefault.feat.ptid);
+        mark_postcopy_downtime_begin(msg.arg.pagefault.address,
+                get_mem_fault_cpu_index(msg.arg.pagefault.feat.ptid));
 
         /*
          * Send the request to the source - we want to request one
          * of our host page sizes (which is >= TPS)
@@ -577,6 +598,7 @@ int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from,
         return -e;
     }
 
+    mark_postcopy_downtime_end((uint64_t)host);
    trace_postcopy_place_page(host);
     return 0;
 
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 195fa94..c9f3e47 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -547,7 +547,6 @@ size_t qemu_get_buffer_in_place(QEMUFile *f, uint8_t **buf, size_t size)
 int qemu_peek_byte(QEMUFile *f, int offset)
 {
     int index = f->buf_index + offset;
-
     assert(!qemu_file_is_writable(f));
     assert(offset < IO_BUF_SIZE);
 
diff --git a/migration/trace-events b/migration/trace-events
index 7372ce2..ab2e1e4 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -110,6 +110,12 @@ process_incoming_migration_co_end(int ret, int ps) "ret=%d postcopy-state=%d"
 process_incoming_migration_co_postcopy_end_main(void) ""
 migration_set_incoming_channel(void *ioc, const char *ioctype) "ioc=%p ioctype=%s"
 migration_set_outgoing_channel(void *ioc, const char *ioctype, const char *hostname) "ioc=%p ioctype=%s hostname=%s"
+mark_postcopy_downtime_begin(uint64_t addr, void *dd, int64_t time, int cpu) "addr 0x%" PRIx64 " dd %p time %" PRId64 " cpu %d"
+mark_postcopy_downtime_end(uint64_t addr, void *dd, int64_t time) "addr 0x%" PRIx64 " dd %p time %" PRId64
+get_postcopy_total_downtime(int num, uint64_t total) "faults %d, total downtime %" PRIu64
+split_duration_and_fill_points(int64_t downtime, uint64_t addr) "downtime %" PRId64 " addr 0x%" PRIx64
+downtime_per_cpu(int cpu_index, int downtime) "downtime cpu[%d]=%d"
+source_return_path_thread_downtime(uint64_t downtime) "downtime %" PRIu64
 
 # migration/rdma.c
 qemu_rdma_accept_incoming_migration(void) ""
@@ -186,7 +192,7 @@ postcopy_ram_enable_notify(void) ""
 postcopy_ram_fault_thread_entry(void) ""
 postcopy_ram_fault_thread_exit(void) ""
 postcopy_ram_fault_thread_quit(void) ""
-postcopy_ram_fault_thread_request(uint64_t hostaddr, const char *ramblock, size_t offset) "Request for HVA=%" PRIx64 " rb=%s offset=%zx"
+postcopy_ram_fault_thread_request(uint64_t hostaddr, const char *ramblock, size_t offset, int pid) "Request for HVA=%" PRIx64 " rb=%s offset=%zx %d"
 postcopy_ram_incoming_cleanup_closeuf(void) ""
 postcopy_ram_incoming_cleanup_entry(void) ""
 postcopy_ram_incoming_cleanup_exit(void) ""
@@ -195,6 +201,7 @@ save_xbzrle_page_skipping(void) ""
 save_xbzrle_page_overflow(void) ""
 ram_save_iterate_big_wait(uint64_t milliconds, int iterations) "big wait: %" PRIu64 " milliseconds, %d iterations"
 ram_load_complete(int ret, uint64_t seq_iter) "exit_code %d seq iteration %" PRIu64
+get_mem_fault_cpu_index(uint32_t pid) "pid %u is not vCPU"
 
 # migration/exec.c
 migration_exec_outgoing(const char *cmd) "cmd=%s"