From patchwork Mon Jun 1 15:17:31 2015
X-Patchwork-Submitter: "Jason J. Herne"
X-Patchwork-Id: 479086
From: "Jason J. Herne"
To: afaerber@suse.de, amit.shah@redhat.com, dgilbert@redhat.com,
    borntraeger@de.ibm.com, quintela@redhat.com, qemu-devel@nongnu.org
Cc: "Jason J. Herne"
Date: Mon, 1 Jun 2015 11:17:31 -0400
Message-Id: <1433171851-18507-3-git-send-email-jjherne@linux.vnet.ibm.com>
In-Reply-To: <1433171851-18507-1-git-send-email-jjherne@linux.vnet.ibm.com>
References: <1433171851-18507-1-git-send-email-jjherne@linux.vnet.ibm.com>
Subject: [Qemu-devel] [PATCH 2/2] migration: Dynamic cpu throttling for
 auto-converge

Remove the traditional auto-converge static 30ms throttling code and replace
it with a dynamic throttling algorithm.

Additionally, be more aggressive when deciding when to start throttling.
Previously we waited for four unproductive memory passes; now we begin
throttling after only two. Four seemed quite arbitrary, and waiting for only
two passes allows us to complete the migration faster.

Signed-off-by: Jason J. Herne
Reviewed-by: Matthew Rosato
---
 arch_init.c           | 95 +++++++++++++++++----------------------------------
 migration/migration.c |  9 +++++
 2 files changed, 41 insertions(+), 63 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 23d3feb..73ae494 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -111,9 +111,7 @@ int graphic_depth = 32;
 #endif
 
 const uint32_t arch_type = QEMU_ARCH;
 
-static bool mig_throttle_on;
 static int dirty_rate_high_cnt;
-static void check_guest_throttling(void);
 
 static uint64_t bitmap_sync_count;
@@ -487,6 +485,31 @@ static size_t save_page_header(QEMUFile *f, RAMBlock *block, ram_addr_t offset)
     return size;
 }
 
+/* Reduce amount of guest cpu execution to hopefully slow down memory writes.
+ * If guest dirty memory rate is reduced below the rate at which we can
+ * transfer pages to the destination then we should be able to complete
+ * migration. Some workloads dirty memory way too fast and will not effectively
+ * converge, even with auto-converge. For these workloads we will continue to
+ * increase throttling until the guest is paused long enough to complete the
+ * migration. This essentially becomes a non-live migration.
+ */
+static void mig_throttle_guest_down(void)
+{
+    CPUState *cpu;
+
+    CPU_FOREACH(cpu) {
+        /* We have not started throttling yet. Lets start it.*/
+        if (!cpu_throttle_active(cpu)) {
+            cpu_throttle_start(cpu, 0.2);
+        }
+
+        /* Throttling is already in place. Just increase the throttling rate */
+        else {
+            cpu_throttle_start(cpu, cpu_throttle_get_ratio(cpu) * 2);
+        }
+    }
+}
+
 /* Update the xbzrle cache to reflect a page that's been sent as all 0.
  * The important thing is that a stale (not-yet-0'd) page be replaced
  * by the new data.
@@ -714,21 +737,21 @@ static void migration_bitmap_sync(void)
         /* The following detection logic can be refined later. For now:
            Check to see if the dirtied bytes is 50% more than the approx.
            amount of bytes that just got transferred since the last time we
-           were in this routine. If that happens >N times (for now N==4)
-           we turn on the throttle down logic */
+           were in this routine. If that happens twice, start or increase
+           throttling */
         bytes_xfer_now = ram_bytes_transferred();
+
         if (s->dirty_pages_rate &&
            (num_dirty_pages_period * TARGET_PAGE_SIZE >
                (bytes_xfer_now - bytes_xfer_prev)/2) &&
-           (dirty_rate_high_cnt++ > 4)) {
+           (dirty_rate_high_cnt++ >= 2)) {
             trace_migration_throttle();
-            mig_throttle_on = true;
             dirty_rate_high_cnt = 0;
+            mig_throttle_guest_down();
         }
         bytes_xfer_prev = bytes_xfer_now;
-    } else {
-        mig_throttle_on = false;
     }
+
     if (migrate_use_xbzrle()) {
         if (iterations_prev != acct_info.iterations) {
             acct_info.xbzrle_cache_miss_rate =
@@ -1197,7 +1220,6 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     RAMBlock *block;
     int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
 
-    mig_throttle_on = false;
     dirty_rate_high_cnt = 0;
     bitmap_sync_count = 0;
     migration_bitmap_sync_init();
@@ -1301,12 +1323,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
         }
         pages_sent += pages;
         acct_info.iterations++;
-        check_guest_throttling();
-        /* we want to check in the 1st loop, just in case it was the 1st time
-           and we had to sync the dirty bitmap.
-           qemu_get_clock_ns() is a bit expensive, so we only check each some
-           iterations
-        */
+
         if ((i & 63) == 0) {
             uint64_t t1 = (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - t0) / 1000000;
             if (t1 > MAX_WAIT) {
@@ -1913,51 +1930,3 @@ TargetInfo *qmp_query_target(Error **errp)
 
     return info;
 }
-
-/* Stub function that's gets run on the vcpu when its brought out of the
-   VM to run inside qemu via async_run_on_cpu()*/
-static void mig_sleep_cpu(void *opq)
-{
-    qemu_mutex_unlock_iothread();
-    g_usleep(30*1000);
-    qemu_mutex_lock_iothread();
-}
-
-/* To reduce the dirty rate explicitly disallow the VCPUs from spending
-   much time in the VM. The migration thread will try to catchup.
-   Workload will experience a performance drop.
-*/
-static void mig_throttle_guest_down(void)
-{
-    CPUState *cpu;
-
-    qemu_mutex_lock_iothread();
-    CPU_FOREACH(cpu) {
-        async_run_on_cpu(cpu, mig_sleep_cpu, NULL);
-    }
-    qemu_mutex_unlock_iothread();
-}
-
-static void check_guest_throttling(void)
-{
-    static int64_t t0;
-    int64_t t1;
-
-    if (!mig_throttle_on) {
-        return;
-    }
-
-    if (!t0) {
-        t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
-        return;
-    }
-
-    t1 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
-
-    /* If it has been more than 40 ms since the last time the guest
-     * was throttled then do it again.
-     */
-    if (40 < (t1-t0)/1000000) {
-        mig_throttle_guest_down();
-        t0 = t1;
-    }
-}
diff --git a/migration/migration.c b/migration/migration.c
index 732d229..c9545df 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -25,6 +25,7 @@
 #include "qemu/thread.h"
 #include "qmp-commands.h"
 #include "trace.h"
+#include "qom/cpu.h"
 
 #define MAX_THROTTLE  (32 << 20)      /* Migration speed throttling */
 
@@ -731,6 +732,7 @@ int64_t migrate_xbzrle_cache_size(void)
 static void *migration_thread(void *opaque)
 {
     MigrationState *s = opaque;
+    CPUState *cpu;
     int64_t initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     int64_t initial_bytes = 0;
@@ -814,6 +816,13 @@ static void *migration_thread(void *opaque)
         }
     }
 
+    /* If we enabled cpu throttling for auto-converge, turn it off. */
+    CPU_FOREACH(cpu) {
+        if (cpu_throttle_active(cpu)) {
+            cpu_throttle_stop(cpu);
+        }
+    }
+
     qemu_mutex_lock_iothread();
    if (s->state == MIGRATION_STATUS_COMPLETED) {
         int64_t end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);