From patchwork Fri May 12 13:31:24 2017
X-Patchwork-Submitter: Alexey Perevalov
X-Patchwork-Id: 761654
From: Alexey Perevalov
To: qemu-devel@nongnu.org
Date: Fri, 12 May 2017 16:31:24 +0300
Message-id: <1494595886-30912-8-git-send-email-a.perevalov@samsung.com>
In-reply-to: <1494595886-30912-1-git-send-email-a.perevalov@samsung.com>
References: <1494595886-30912-1-git-send-email-a.perevalov@samsung.com>
Subject: [Qemu-devel] [PATCH V5 7/9] migration: calculate vCPU blocktime on dst side
Cc: i.maximets@samsung.com, dgilbert@redhat.com, peterx@redhat.com, a.perevalov@samsung.com

This patch provides blocktime calculation per vCPU, both as a per-vCPU
total and as an overlapped value for all vCPUs.

This approach was suggested by Peter Xu as an improvement over the
previous approach, where QEMU kept a tree with the faulted page address
and a bitmask of CPUs in it. Now QEMU keeps an array with the faulted
page address as the value and the vCPU as the index. This helps to find
the proper vCPU at UFFD_COPY time. It also keeps a list of blocktime per
vCPU (which can be traced with page_fault_addr).

Blocktime will not be calculated if the postcopy_blocktime field of
MigrationIncomingState wasn't initialized.

Signed-off-by: Alexey Perevalov
---
 migration/postcopy-ram.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++-
 migration/trace-events   |  5 ++-
 2 files changed, 90 insertions(+), 2 deletions(-)

diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index a1f1705..e2660ae 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -23,6 +23,7 @@
 #include "migration/postcopy-ram.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/balloon.h"
+#include 
 #include "qemu/error-report.h"
 #include "trace.h"
 
@@ -542,6 +543,86 @@ static int ram_block_enable_notify(const char *block_name, void *host_addr,
     return 0;
 }
 
+static int get_mem_fault_cpu_index(uint32_t pid)
+{
+    CPUState *cpu_iter;
+
+    CPU_FOREACH(cpu_iter) {
+        if (cpu_iter->thread_id == pid) {
+            return cpu_iter->cpu_index;
+        }
+    }
+    trace_get_mem_fault_cpu_index(pid);
+    return -1;
+}
+
+static void mark_postcopy_blocktime_begin(uint64_t addr, int cpu)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    PostcopyBlocktimeContext *dc;
+    int64_t now_ms;
+    if (!mis->blocktime_ctx || cpu < 0) {
+        return;
+    }
+    now_ms = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    dc = mis->blocktime_ctx;
+    if (dc->vcpu_addr[cpu] == 0) {
+        atomic_inc(&dc->smp_cpus_down);
+    }
+
+    atomic_xchg__nocheck(&dc->vcpu_addr[cpu], addr);
+    atomic_xchg__nocheck(&dc->last_begin, now_ms);
+    atomic_xchg__nocheck(&dc->page_fault_vcpu_time[cpu], now_ms);
+
+    trace_mark_postcopy_blocktime_begin(addr, dc, dc->page_fault_vcpu_time[cpu],
+            cpu);
+}
+
+static void mark_postcopy_blocktime_end(uint64_t addr)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    PostcopyBlocktimeContext *dc;
+    int i, affected_cpu = 0;
+    int64_t now_ms;
+    bool vcpu_total_blocktime = false;
+
+    if (!mis->blocktime_ctx) {
+        return;
+    }
+    dc = mis->blocktime_ctx;
+    now_ms = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+
+    /* look up the cpu, to clear it;
+     * this algorithm looks straightforward, but it's not
+     * optimal, a more optimal algorithm is keeping a tree or hash
+     * where key is address value is a list of */
+    for (i = 0; i < smp_cpus; i++) {
+        uint64_t vcpu_blocktime = 0;
+        if (atomic_fetch_add(&dc->vcpu_addr[i], 0) != addr) {
+            continue;
+        }
+        atomic_xchg__nocheck(&dc->vcpu_addr[i], 0);
+        vcpu_blocktime = now_ms -
+            atomic_fetch_add(&dc->page_fault_vcpu_time[i], 0);
+        affected_cpu += 1;
+        /* we need to know whether mark_postcopy_end was due to a
+         * faulted page; the other possible case is a prefetched
+         * page, and in that case we shouldn't be here */
+        if (!vcpu_total_blocktime &&
+            atomic_fetch_add(&dc->smp_cpus_down, 0) == smp_cpus) {
+            vcpu_total_blocktime = true;
+        }
+        /* continue the loop, since one page could affect several vCPUs */
+        dc->vcpu_blocktime[i] += vcpu_blocktime;
+    }
+
+    atomic_sub(&dc->smp_cpus_down, affected_cpu);
+    if (vcpu_total_blocktime) {
+        dc->total_blocktime += now_ms - atomic_fetch_add(&dc->last_begin, 0);
+    }
+    trace_mark_postcopy_blocktime_end(addr, dc, dc->total_blocktime);
+}
+
 /*
  * Handle faults detected by the USERFAULT markings
  */
@@ -619,8 +700,11 @@ static void *postcopy_ram_fault_thread(void *opaque)
         rb_offset &= ~(qemu_ram_pagesize(rb) - 1);
         trace_postcopy_ram_fault_thread_request(msg.arg.pagefault.address,
                                                 qemu_ram_get_idstr(rb),
-                                                rb_offset);
+                                                rb_offset,
+                                                msg.arg.pagefault.feat.ptid);
 
+        mark_postcopy_blocktime_begin((uintptr_t)(msg.arg.pagefault.address),
+                         get_mem_fault_cpu_index(msg.arg.pagefault.feat.ptid));
         /*
          * Send the request to the source - we want to request one
          * of our host page sizes (which is >= TPS)
@@ -715,6 +799,7 @@ int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from,
 
         return -e;
     }
+    mark_postcopy_blocktime_end((uint64_t)(uintptr_t)host);
 
     trace_postcopy_place_page(host);
     return 0;
diff --git a/migration/trace-events b/migration/trace-events
index b8f01a2..9424e3e 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -110,6 +110,8 @@ process_incoming_migration_co_end(int ret, int ps) "ret=%d postcopy-state=%d"
 process_incoming_migration_co_postcopy_end_main(void) ""
 migration_set_incoming_channel(void *ioc, const char *ioctype) "ioc=%p ioctype=%s"
 migration_set_outgoing_channel(void *ioc, const char *ioctype, const char *hostname) "ioc=%p ioctype=%s hostname=%s"
+mark_postcopy_blocktime_begin(uint64_t addr, void *dd, int64_t time, int cpu) "addr 0x%" PRIx64 " dd %p time %" PRId64 " cpu %d"
+mark_postcopy_blocktime_end(uint64_t addr, void *dd, int64_t time) "addr 0x%" PRIx64 " dd %p time %" PRId64
 
 # migration/rdma.c
 qemu_rdma_accept_incoming_migration(void) ""
@@ -186,7 +188,7 @@ postcopy_ram_enable_notify(void) ""
 postcopy_ram_fault_thread_entry(void) ""
 postcopy_ram_fault_thread_exit(void) ""
 postcopy_ram_fault_thread_quit(void) ""
-postcopy_ram_fault_thread_request(uint64_t hostaddr, const char *ramblock, size_t offset) "Request for HVA=%" PRIx64 " rb=%s offset=%zx"
+postcopy_ram_fault_thread_request(uint64_t hostaddr, const char *ramblock, size_t offset, uint32_t pid) "Request for HVA=%" PRIx64 " rb=%s offset=%zx %u"
 postcopy_ram_incoming_cleanup_closeuf(void) ""
 postcopy_ram_incoming_cleanup_entry(void) ""
 postcopy_ram_incoming_cleanup_exit(void) ""
@@ -195,6 +197,7 @@ save_xbzrle_page_skipping(void) ""
 save_xbzrle_page_overflow(void) ""
 ram_save_iterate_big_wait(uint64_t milliconds, int iterations) "big wait: %" PRIu64 " milliseconds, %d iterations"
 ram_load_complete(int ret, uint64_t seq_iter) "exit_code %d seq iteration %" PRIu64
+get_mem_fault_cpu_index(uint32_t pid) "pid %u is not vCPU"
 
 # migration/exec.c
 migration_exec_outgoing(const char *cmd) "cmd=%s"
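
Note for readers: the accounting described in the commit message (an
array of faulted addresses indexed by vCPU, a per-vCPU blocktime, and an
overlapped total) can be modelled outside QEMU. The following is only a
minimal single-threaded sketch under stated assumptions, not the code of
this patch: BlocktimeModel, model_begin, model_end, blocktime_demo and
NR_VCPUS are hypothetical names, timestamps are passed in explicitly
instead of being read from qemu_clock_get_ms(), the atomics are omitted,
and the addr == 0 guard in model_end is an addition of the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NR_VCPUS 2 /* hypothetical vCPU count, stands in for smp_cpus */

/* Simplified stand-in for PostcopyBlocktimeContext (no atomics). */
typedef struct {
    uint64_t vcpu_addr[NR_VCPUS];           /* faulted address, 0 = not waiting */
    int64_t page_fault_vcpu_time[NR_VCPUS]; /* when the vCPU started waiting */
    int64_t vcpu_blocktime[NR_VCPUS];       /* accumulated wait per vCPU */
    int64_t last_begin;                     /* time of the most recent fault */
    int64_t total_blocktime;                /* time with all vCPUs waiting */
    int cpus_down;                          /* vCPUs currently waiting */
} BlocktimeModel;

/* A vCPU faulted on addr at now_ms: record the address under its index. */
static void model_begin(BlocktimeModel *m, uint64_t addr, int cpu,
                        int64_t now_ms)
{
    if (cpu < 0) {
        return; /* the faulting thread was not a vCPU */
    }
    assert(cpu < NR_VCPUS);
    if (m->vcpu_addr[cpu] == 0) {
        m->cpus_down++;
    }
    m->vcpu_addr[cpu] = addr;
    m->last_begin = now_ms;
    m->page_fault_vcpu_time[cpu] = now_ms;
}

/* The page at addr was placed at now_ms: release every vCPU waiting on it. */
static void model_end(BlocktimeModel *m, uint64_t addr, int64_t now_ms)
{
    int affected = 0;
    int all_blocked = 0;

    if (addr == 0) {
        return; /* sketch-only guard: 0 is the "not waiting" marker */
    }
    for (int i = 0; i < NR_VCPUS; i++) {
        if (m->vcpu_addr[i] != addr) {
            continue;
        }
        m->vcpu_addr[i] = 0;
        m->vcpu_blocktime[i] += now_ms - m->page_fault_vcpu_time[i];
        affected++;
        if (m->cpus_down == NR_VCPUS) {
            all_blocked = 1; /* every vCPU was waiting until this moment */
        }
    }
    m->cpus_down -= affected;
    if (all_blocked) {
        m->total_blocktime += now_ms - m->last_begin;
    }
}

/* Two vCPUs fault on different pages with overlapping wait intervals. */
static BlocktimeModel blocktime_demo(void)
{
    BlocktimeModel m = {0};

    model_begin(&m, 0x1000, 0, 100); /* vCPU 0 waits from t=100 */
    model_begin(&m, 0x2000, 1, 130); /* vCPU 1 waits from t=130 */
    model_end(&m, 0x1000, 150);      /* vCPU 0 resumes at t=150 */
    model_end(&m, 0x2000, 170);      /* vCPU 1 resumes at t=170 */
    return m;
}
```

In this scenario vCPU 0 waits for 50 ms and vCPU 1 for 40 ms, while the
overlapped value only covers the 20 ms (t=130 to t=150) during which
both vCPUs were waiting at once. Calling blocktime_demo() from a main()
and printing the fields is enough to try it out.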