From patchwork Mon Apr 14 12:53:40 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexey Kardashevskiy
X-Patchwork-Id: 338946
From: Alexey Kardashevskiy
To: qemu-devel@nongnu.org
Cc: Alexey Kardashevskiy, qemu-ppc@nongnu.org, Alexander Graf
Date: Mon, 14 Apr 2014 22:53:40 +1000
Message-Id: <1397480020-24624-1-git-send-email-aik@ozlabs.ru>
X-Mailer: git-send-email 1.8.4.rc4
Subject: [Qemu-devel] [PATCH v6] spapr: Add support for time base offset migration

This allows guests to have a different
timebase origin from the host.

This is needed for migration, where a guest can migrate from one host
to another and the two hosts might have a different timebase origin.
However, the timebase seen by the guest must not go backwards, and
should go forwards only by a small amount corresponding to the time
taken for the migration.

This is only supported for recent POWER hardware which has the TBU40
(timebase upper 40 bits) register. That includes POWER6, 7, 8 but not 970.

This uses kvm_set_one_reg()/kvm_get_one_reg() to access a special
register which is not in env->spr, so it requires the
kvm_set_one_reg/kvm_get_one_reg patch.

The feature must be present in the host kernel.

Signed-off-by: Alexey Kardashevskiy
---
Changes:
v6:
* time_of_the_day is now time_of_the_day_ns and measured in ns instead of us
* VMSTATE_PPC_TIMEBASE_V supports versions now

v5:
* fixed multiple comments in cpu_ppc_get_adjusted_tb and merged it
into timebase_post_load()
* removed round_up(1<<24) as KVM is expected to do this anyway
* removed @freq from migration stream
* renamed PPCTimebaseOffset to PPCTimebase
* CLOCKS_PER_SEC is used as a constant which is 1000000us/s (man clock)

v4:
* made it a per-machine timebase offset rather than per-CPU

v3:
* kvm_access_one_reg moved out to a separate patch
* tb_offset and host_timebase were replaced with guest_timebase as
the destination does not really care about the offset on the source

v2:
* bumped the vmstate_ppc_cpu version
* defined version for the env.tb_env field
---
 hw/ppc/ppc.c           | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr.c         |  4 +--
 include/hw/ppc/spapr.h |  1 +
 target-ppc/cpu-qom.h   | 16 +++++++++++
 target-ppc/kvm.c       |  5 ++++
 trace-events           |  3 ++
 6 files changed, 105 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index 71df471..3be4d8c 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -29,9 +29,11 @@
 #include "sysemu/cpus.h"
 #include "hw/timer/m48t59.h"
 #include "qemu/log.h"
+#include "qemu/error-report.h"
 #include "hw/loader.h"
 #include "sysemu/kvm.h"
 #include "kvm_ppc.h"
+#include "trace.h"
 
 //#define PPC_DEBUG_IRQ
 //#define PPC_DEBUG_TB
@@ -49,6 +51,8 @@
 # define LOG_TB(...) do { } while (0)
 #endif
 
+#define NSEC_PER_SEC 1000000000LL
+
 static void cpu_ppc_tb_stop (CPUPPCState *env);
 static void cpu_ppc_tb_start (CPUPPCState *env);
 
@@ -829,6 +833,80 @@ static void cpu_ppc_set_tb_clk (void *opaque, uint32_t freq)
     cpu_ppc_store_purr(cpu, 0x0000000000000000ULL);
 }
 
+static void timebase_pre_save(void *opaque)
+{
+    PPCTimebase *tb = opaque;
+    uint64_t ticks = cpu_get_real_ticks();
+    PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
+
+    if (!first_ppc_cpu->env.tb_env) {
+        error_report("No timebase object");
+        return;
+    }
+
+    tb->time_of_the_day_ns = get_clock_realtime();
+    /*
+     * tb_offset is only expected to be changed by migration so
+     * there is no need to update it from KVM here
+     */
+    tb->guest_timebase = ticks + first_ppc_cpu->env.tb_env->tb_offset;
+}
+
+static int timebase_post_load(void *opaque, int version_id)
+{
+    PPCTimebase *tb = opaque;
+    CPUState *cpu;
+    PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
+    int64_t tb_off_adj, tb_off;
+    int64_t migration_duration_ns, migration_duration_tb, guest_tb, host_ns;
+    unsigned long freq;
+
+    if (!first_ppc_cpu->env.tb_env) {
+        error_report("No timebase object");
+        return -1;
+    }
+
+    freq = first_ppc_cpu->env.tb_env->tb_freq;
+    /*
+     * Calculate timebase on the destination side of migration.
+     * The destination timebase must be not less than the source timebase.
+     * We try to adjust timebase by downtime if host clocks are not
+     * too much out of sync (1 second for now).
+     */
+    host_ns = get_clock_realtime();
+    migration_duration_ns = MIN(NSEC_PER_SEC, host_ns - tb->time_of_the_day_ns);
+    migration_duration_tb = muldiv64(migration_duration_ns, freq, NSEC_PER_SEC);
+    guest_tb = tb->guest_timebase + MAX(0, migration_duration_tb);
+
+    tb_off_adj = guest_tb - cpu_get_real_ticks();
+
+    tb_off = first_ppc_cpu->env.tb_env->tb_offset;
+    trace_ppc_tb_adjust(tb_off, tb_off_adj, tb_off_adj - tb_off,
+                        (tb_off_adj - tb_off) / freq);
+
+    /* Set new offset to all CPUs */
+    CPU_FOREACH(cpu) {
+        PowerPCCPU *pcpu = POWERPC_CPU(cpu);
+        pcpu->env.tb_env->tb_offset = tb_off_adj;
+    }
+
+    return 0;
+}
+
+const VMStateDescription vmstate_ppc_timebase = {
+    .name = "timebase",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .pre_save = timebase_pre_save,
+    .post_load = timebase_post_load,
+    .fields = (VMStateField []) {
+        VMSTATE_UINT64(guest_timebase, PPCTimebase),
+        VMSTATE_UINT64(time_of_the_day_ns, PPCTimebase),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 /* Set up (once) timebase frequency (in Hz) */
 clk_setup_cb cpu_ppc_tb_init (CPUPPCState *env, uint32_t freq)
 {
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 451c473..297fc6f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -818,7 +818,7 @@ static int spapr_vga_init(PCIBus *pci_bus)
 
 static const VMStateDescription vmstate_spapr = {
     .name = "spapr",
-    .version_id = 1,
+    .version_id = 2,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
     .fields = (VMStateField []) {
@@ -826,7 +826,7 @@ static const VMStateDescription vmstate_spapr = {
 
         /* RTC offset */
         VMSTATE_UINT64(rtc_offset, sPAPREnvironment),
-
+        VMSTATE_PPC_TIMEBASE_V(tb, sPAPREnvironment, 2),
         VMSTATE_END_OF_LIST()
     },
 };
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 5fdac1e..9f8bb89 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -29,6 +29,7 @@ typedef struct sPAPREnvironment {
     target_ulong entry_point;
     uint32_t next_irq;
     uint64_t rtc_offset;
+    struct PPCTimebase tb;
     bool has_graphics;
 
     uint32_t epow_irq;
diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h
index 47dc8e6..62301a3 100644
--- a/target-ppc/cpu-qom.h
+++ b/target-ppc/cpu-qom.h
@@ -120,6 +120,22 @@ int ppc64_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
                                int cpuid, void *opaque);
 #ifndef CONFIG_USER_ONLY
 extern const struct VMStateDescription vmstate_ppc_cpu;
+
+typedef struct PPCTimebase {
+    uint64_t guest_timebase;
+    uint64_t time_of_the_day_ns;
+} PPCTimebase;
+
+extern const struct VMStateDescription vmstate_ppc_timebase;
+
+#define VMSTATE_PPC_TIMEBASE_V(_field, _state, _version) { \
+    .name = (stringify(_field)), \
+    .version_id = (_version), \
+    .size = sizeof(PPCTimebase), \
+    .vmsd = &vmstate_ppc_timebase, \
+    .flags = VMS_STRUCT, \
+    .offset = vmstate_offset_value(_state, _field, PPCTimebase), \
+}
 #endif
 
 #endif
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 73dbb02..a8a1498 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -35,6 +35,7 @@
 #include "hw/sysbus.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_vio.h"
+#include "hw/ppc/ppc.h"
 #include "sysemu/watchdog.h"
 #include "trace.h"
 
@@ -890,6 +891,8 @@ int kvm_arch_put_registers(CPUState *cs, int level)
                 DPRINTF("Warning: Unable to set VPA information to KVM\n");
             }
         }
+
+        kvm_set_one_reg(cs, KVM_REG_PPC_TB_OFFSET, &env->tb_env->tb_offset);
 #endif /* TARGET_PPC64 */
     }
 
@@ -1133,6 +1136,8 @@ int kvm_arch_get_registers(CPUState *cs)
                 DPRINTF("Warning: Unable to get VPA information from KVM\n");
             }
         }
+
+        kvm_get_one_reg(cs, KVM_REG_PPC_TB_OFFSET, &env->tb_env->tb_offset);
 #endif
     }
 
diff --git a/trace-events b/trace-events
index 9303245..ce629c1 100644
--- a/trace-events
+++ b/trace-events
@@ -1161,6 +1161,9 @@ spapr_iommu_get(uint64_t liobn, uint64_t ioba, uint64_t ret, uint64_t tce) "liob
 spapr_iommu_xlate(uint64_t liobn, uint64_t ioba, uint64_t tce, unsigned perm, unsigned pgsize) "liobn=%"PRIx64" 0x%"PRIx64" -> 0x%"PRIx64" perm=%u mask=%x"
 spapr_iommu_new_table(uint64_t liobn, void *tcet, void *table, int fd) "liobn=%"PRIx64" tcet=%p table=%p fd=%d"
 
+# hw/ppc/ppc.c
+ppc_tb_adjust(uint64_t offs1, uint64_t offs2, int64_t diff, int64_t seconds) "adjusted from 0x%"PRIx64" to 0x%"PRIx64", diff %"PRId64" (%"PRId64"s)"
+
 # util/hbitmap.c
 hbitmap_iter_skip_words(const void *hb, void *hbi, uint64_t pos, unsigned long cur) "hb %p hbi %p pos %"PRId64" cur 0x%lx"
 hbitmap_reset(void *hb, uint64_t start, uint64_t count, uint64_t sbit, uint64_t ebit) "hb %p items %"PRIu64",%"PRIu64" bits %"PRIu64"..%"PRIu64
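
Note on the arithmetic in timebase_post_load(): the following is a minimal standalone sketch of the destination-side calculation described in the commit message. It converts the downtime to timebase ticks, never steps the guest backwards, caps the adjustment at one second, and then derives the new offset from the destination host's tick counter. It is not the QEMU code. The helper names downtime_to_tb_ticks() and new_tb_offset() are hypothetical, the lower clamp at zero is an assumption drawn from the "must not go backwards" requirement, and signed 64-bit arithmetic is used throughout for simplicity.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * Hypothetical helper (not part of the patch): convert the wall-clock
 * gap between source save and destination load into timebase ticks.
 */
static int64_t downtime_to_tb_ticks(int64_t src_ns, int64_t dst_ns,
                                    uint32_t tb_freq_hz)
{
    int64_t ns = dst_ns - src_ns;

    assert(tb_freq_hz > 0);
    if (ns < 0) {
        ns = 0;             /* assumption: never move the guest backwards */
    }
    if (ns > NSEC_PER_SEC) {
        ns = NSEC_PER_SEC;  /* host clocks too far apart: cap at 1 second */
    }
    /* ns <= 1e9 and tb_freq_hz < 2^32, so the product fits in int64_t */
    return ns * (int64_t)tb_freq_hz / NSEC_PER_SEC;
}

/*
 * Hypothetical helper: offset to add to the destination host's tick
 * counter so that the guest sees src_guest_tb + adj_ticks.
 */
static int64_t new_tb_offset(int64_t src_guest_tb, int64_t adj_ticks,
                             int64_t dst_host_ticks)
{
    return src_guest_tb + adj_ticks - dst_host_ticks;
}
```

As an example, with a timebase frequency of 512000000 Hz (an illustrative value), a downtime of 250 ms gives downtime_to_tb_ticks(0, 250000000, 512000000) == 128000000 ticks, while a 5 second gap between host clocks is capped to one second, i.e. 512000000 ticks.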