From patchwork Wed Jul 24 14:37:22 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 261422 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 97BC32C008E for ; Thu, 25 Jul 2013 00:38:20 +1000 (EST) Received: from localhost ([::1]:60584 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V20Cj-0001Yt-8U for incoming@patchwork.ozlabs.org; Wed, 24 Jul 2013 10:38:13 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:47213) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V20CL-0001YS-58 for qemu-devel@nongnu.org; Wed, 24 Jul 2013 10:37:58 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1V20CF-00068D-Tu for qemu-devel@nongnu.org; Wed, 24 Jul 2013 10:37:49 -0400 Received: from mail-gg0-x232.google.com ([2607:f8b0:4002:c02::232]:63333) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V20CF-00067q-LK for qemu-devel@nongnu.org; Wed, 24 Jul 2013 10:37:43 -0400 Received: by mail-gg0-f178.google.com with SMTP id l12so58217gge.23 for ; Wed, 24 Jul 2013 07:37:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:from:to:cc:subject:date:message-id:x-mailer; bh=AUX9b66uI6x3HiO74/3ZjEUteCkDlsl1+na69zcEmFo=; b=uZeTCOmsbLoJzD3niDKxdFvdKseXiyorGlz6dqvXGAPjcb7ZcB8oj4y6lRK6Kf9/Gg OsqmgVI5J8K4bFXpP1X1XaV1STNBTI1oi6WkuCyV2+cvCHmELq0+Ulvfhjey6h5QCW0v 4SqvMaeIT5dgTcb0NNvHCVe9G6mXhSDsGZX9d3ks9EYDLue/SONHwdfeEuQseDJh53V9 bjRhfvAF5l0pC0EMT477j7YXv/Ci84+UmB77W31iIdo6CaKK83y02VJr6Tw733q69ozM KjVMKET6iOJJLOQX4MXOCYeZVkLPbxIn81SbV/um6DIrDHILB7b1/zvC3FiQ1dYPajbb IQ7w== X-Received: by 10.236.161.198 with SMTP id w46mr12772819yhk.31.1374676663137; Wed, 24 Jul 2013 07:37:43 -0700 (PDT) Received: from yakj.usersys.redhat.com (net-2-39-8-162.cust.dsl.vodafone.it. [2.39.8.162]) by mx.google.com with ESMTPSA id 66sm52211184yhe.20.2013.07.24.07.37.40 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Wed, 24 Jul 2013 07:37:42 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Date: Wed, 24 Jul 2013 16:37:22 +0200 Message-Id: <1374676642-2412-1-git-send-email-pbonzini@redhat.com> X-Mailer: git-send-email 1.8.3.1 X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2607:f8b0:4002:c02::232 Cc: gnatapov@redhat.com, ehabkost@redhat.com Subject: [Qemu-devel] [PATCH] kvm: migrate vPMU state X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org This requires kernel 3.10 but it is otherwise quite simple to do. The kernel pays attention to MSRs writes that are host initiated, and disables all side effects of the PMU registers (e.g. the global status MSR can be written and global overflow control MSR does not clear bits in the global status MSR). Only two bits are interesting. First, the number of general-purpose counters must be fetched from CPUID so that we do not read non-existent MSRs. It need not be part of the migration stream. Second, to avoid any possible side effects during the setting of MSRs I stop the PMU while setting the counters and event selector MSRs. Stopping the PMU snapshots the counters and ensures that no strange races can happen if the counters were saved close to their overflow value. Signed-off-by: Paolo Bonzini --- target-i386/cpu.h | 23 +++++++++++++ target-i386/kvm.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++++--- target-i386/machine.c | 44 ++++++++++++++++++++++++ 3 files changed, 155 insertions(+), 5 deletions(-) diff --git a/target-i386/cpu.h b/target-i386/cpu.h index 058c57f..522eed4 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -304,6 +304,8 @@ #define MSR_TSC_ADJUST 0x0000003b #define MSR_IA32_TSCDEADLINE 0x6e0 +#define MSR_P6_PERFCTR0 0xc1 + #define MSR_MTRRcap 0xfe #define MSR_MTRRcap_VCNT 8 #define MSR_MTRRcap_FIXRANGE_SUPPORT (1 << 8) @@ -317,6 +319,8 @@ #define MSR_MCG_STATUS 0x17a #define MSR_MCG_CTL 0x17b +#define MSR_P6_EVNTSEL0 0x186 + #define MSR_IA32_PERF_STATUS 0x198 #define MSR_IA32_MISC_ENABLE 0x1a0 @@ -342,6 +346,14 @@ #define MSR_MTRRdefType 0x2ff +#define MSR_CORE_PERF_FIXED_CTR0 0x309 +#define MSR_CORE_PERF_FIXED_CTR1 0x30a +#define MSR_CORE_PERF_FIXED_CTR2 0x30b +#define MSR_CORE_PERF_FIXED_CTR_CTRL 0x38d +#define MSR_CORE_PERF_GLOBAL_STATUS 0x38e +#define MSR_CORE_PERF_GLOBAL_CTRL 0x38f +#define MSR_CORE_PERF_GLOBAL_OVF_CTRL 0x390 + #define MSR_MC0_CTL 0x400 #define MSR_MC0_STATUS 0x401 #define MSR_MC0_ADDR 0x402 @@ -720,6 +732,9 @@ typedef struct { #define CPU_NB_REGS CPU_NB_REGS32 #endif +#define MAX_FIXED_COUNTERS 3 +#define MAX_GP_COUNTERS (MSR_IA32_PERF_STATUS - MSR_P6_EVNTSEL0) + #define NB_MMU_MODES 3 typedef enum TPRAccess { @@ -814,6 +829,14 @@ typedef struct CPUX86State { uint64_t mcg_status; uint64_t msr_ia32_misc_enable; + uint64_t msr_fixed_ctr_ctrl; + uint64_t msr_global_ctrl; + uint64_t msr_global_status; + uint64_t msr_global_ovf_ctrl; + uint64_t msr_fixed_counters[MAX_FIXED_COUNTERS]; + uint64_t msr_gp_counters[MAX_GP_COUNTERS]; + uint64_t msr_gp_evtsel[MAX_GP_COUNTERS]; + /* exception/interrupt handling */ int error_code; int exception_is_int; diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 9ffb6ca..a209e8f 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -71,6 +71,9 @@ static bool has_msr_misc_enable; static bool has_msr_kvm_steal_time; static int lm_capable_kernel; +static bool has_msr_architectural_pmu; +static uint32_t num_architectural_pmu_counters; + bool kvm_allows_irq0_override(void) { return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing(); @@ -579,6 +582,25 @@ int kvm_arch_init_vcpu(CPUState *cs) break; } } + + if (limit >= 0x0a) { + uint32_t ver; + + cpu_x86_cpuid(env, 0x0a, 0, &ver, &unused, &unused, &unused); + if ((ver & 0xff) > 0) { + has_msr_architectural_pmu = true; + num_architectural_pmu_counters = (ver & 0xff00) >> 8; + + /* Shouldn't be more than 32, since that's the number of bits + * available in EBX to tell us _which_ counters are available. + * Play it safe. + */ + if (num_architectural_pmu_counters > MAX_GP_COUNTERS) { + num_architectural_pmu_counters = MAX_GP_COUNTERS; + } + } + } + cpu_x86_cpuid(env, 0x80000000, 0, &limit, &unused, &unused, &unused); for (i = 0x80000000; i <= limit; i++) { @@ -1053,7 +1075,7 @@ static int kvm_put_msrs(X86CPU *cpu, int level) struct kvm_msr_entry entries[100]; } msr_data; struct kvm_msr_entry *msrs = msr_data.entries; - int n = 0; + int n = 0, i; kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_CS, env->sysenter_cs); kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_ESP, env->sysenter_esp); @@ -1095,9 +1117,8 @@ static int kvm_put_msrs(X86CPU *cpu, int level) } } /* - * The following paravirtual MSRs have side effects on the guest or are - * too heavy for normal writeback. Limit them to reset or full state - * updates. + * The following MSRs have side effects on the guest or are too heavy + * for normal writeback. Limit them to reset or full state updates. */ if (level >= KVM_PUT_RESET_STATE) { kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME, @@ -1115,6 +1136,33 @@ static int kvm_put_msrs(X86CPU *cpu, int level) kvm_msr_entry_set(&msrs[n++], MSR_KVM_STEAL_TIME, env->steal_time_msr); } + if (has_msr_architectural_pmu) { + /* Stop the counter. */ + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR_CTRL, 0); + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_CTRL, 0); + + /* Set the counter values. */ + for (i = 0; i < MAX_FIXED_COUNTERS; i++) { + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR0 + i, + env->msr_fixed_counters[i]); + } + for (i = 0; i < num_architectural_pmu_counters; i++) { + kvm_msr_entry_set(&msrs[n++], MSR_P6_PERFCTR0 + i, + env->msr_gp_counters[i]); + kvm_msr_entry_set(&msrs[n++], MSR_P6_EVNTSEL0 + i, + env->msr_gp_evtsel[i]); + } + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_STATUS, + env->msr_global_status); + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_OVF_CTRL, + env->msr_global_ovf_ctrl); + + /* Now start the PMU. */ + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR_CTRL, + env->msr_fixed_ctr_ctrl); + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_CTRL, + env->msr_global_ctrl); + } if (hyperv_hypercall_available()) { kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_GUEST_OS_ID, 0); kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_HYPERCALL, 0); @@ -1371,6 +1419,19 @@ static int kvm_get_msrs(X86CPU *cpu) if (has_msr_kvm_steal_time) { msrs[n++].index = MSR_KVM_STEAL_TIME; } + if (has_msr_architectural_pmu) { + msrs[n++].index = MSR_CORE_PERF_FIXED_CTR_CTRL; + msrs[n++].index = MSR_CORE_PERF_GLOBAL_CTRL; + msrs[n++].index = MSR_CORE_PERF_GLOBAL_STATUS; + msrs[n++].index = MSR_CORE_PERF_GLOBAL_OVF_CTRL; + for (i = 0; i < MAX_FIXED_COUNTERS; i++) { + msrs[n++].index = MSR_CORE_PERF_FIXED_CTR0 + i; + } + for (i = 0; i < num_architectural_pmu_counters; i++) { + msrs[n++].index = MSR_P6_PERFCTR0 + i; + msrs[n++].index = MSR_P6_EVNTSEL0 + i; + } + } if (env->mcg_cap) { msrs[n++].index = MSR_MCG_STATUS; @@ -1387,7 +1448,8 @@ static int kvm_get_msrs(X86CPU *cpu) } for (i = 0; i < ret; i++) { - switch (msrs[i].index) { + uint32_t index = msrs[i].index; + switch (index) { case MSR_IA32_SYSENTER_CS: env->sysenter_cs = msrs[i].data; break; @@ -1459,6 +1521,27 @@ static int kvm_get_msrs(X86CPU *cpu) case MSR_KVM_STEAL_TIME: env->steal_time_msr = msrs[i].data; break; + case MSR_CORE_PERF_FIXED_CTR_CTRL: + env->msr_fixed_ctr_ctrl = msrs[i].data; + break; + case MSR_CORE_PERF_GLOBAL_CTRL: + env->msr_global_ctrl = msrs[i].data; + break; + case MSR_CORE_PERF_GLOBAL_STATUS: + env->msr_global_status = msrs[i].data; + break; + case MSR_CORE_PERF_GLOBAL_OVF_CTRL: + env->msr_global_ovf_ctrl = msrs[i].data; + break; + case MSR_CORE_PERF_FIXED_CTR0 ... MSR_CORE_PERF_FIXED_CTR0 + MAX_FIXED_COUNTERS - 1: + env->msr_fixed_counters[index - MSR_CORE_PERF_FIXED_CTR0] = msrs[i].data; + break; + case MSR_P6_PERFCTR0 ... MSR_P6_PERFCTR0 + MAX_GP_COUNTERS - 1: + env->msr_gp_counters[index - MSR_P6_PERFCTR0] = msrs[i].data; + break; + case MSR_P6_EVNTSEL0 ... MSR_P6_EVNTSEL0 + MAX_GP_COUNTERS - 1: + env->msr_gp_evtsel[index - MSR_P6_EVNTSEL0] = msrs[i].data; + break; } } diff --git a/target-i386/machine.c b/target-i386/machine.c index 3659db9..5bf9de6 100644 --- a/target-i386/machine.c +++ b/target-i386/machine.c @@ -410,6 +410,47 @@ static const VMStateDescription vmstate_msr_ia32_misc_enable = { } }; +static bool pmu_enable_needed(void *opaque) +{ + X86CPU *cpu = opaque; + CPUX86State *env = &cpu->env; + int i; + + if (env->msr_fixed_ctr_ctrl || env->msr_global_ctrl || + env->msr_global_status || env->msr_global_ovf_ctrl) { + return true; + } + for (i = 0; i < MAX_FIXED_COUNTERS; i++) { + if (env->msr_fixed_counters[i]) { + return true; + } + } + for (i = 0; i < MAX_GP_COUNTERS; i++) { + if (env->msr_gp_counters[i] || env->msr_gp_evtsel[i]) { + return true; + } + } + + return false; +} + +static const VMStateDescription vmstate_msr_architectural_pmu = { + .name = "cpu/msr_architectural_pmu", + .version_id = 1, + .minimum_version_id = 1, + .minimum_version_id_old = 1, + .fields = (VMStateField []) { + VMSTATE_UINT64(env.msr_fixed_ctr_ctrl, X86CPU), + VMSTATE_UINT64(env.msr_global_ctrl, X86CPU), + VMSTATE_UINT64(env.msr_global_status, X86CPU), + VMSTATE_UINT64(env.msr_global_ovf_ctrl, X86CPU), + VMSTATE_UINT64_ARRAY(env.msr_fixed_counters, X86CPU, MAX_FIXED_COUNTERS), + VMSTATE_UINT64_ARRAY(env.msr_gp_counters, X86CPU, MAX_GP_COUNTERS), + VMSTATE_UINT64_ARRAY(env.msr_gp_evtsel, X86CPU, MAX_GP_COUNTERS), + VMSTATE_END_OF_LIST() + } +}; + const VMStateDescription vmstate_x86_cpu = { .name = "cpu", .version_id = 12, @@ -535,6 +576,9 @@ const VMStateDescription vmstate_x86_cpu = { }, { .vmsd = &vmstate_msr_ia32_misc_enable, .needed = misc_enable_needed, + }, { + .vmsd = &vmstate_msr_architectural_pmu, + .needed = pmu_enable_needed, } , { /* empty */ }