From patchwork Thu Feb 25 08:48:47 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Kiszka X-Patchwork-Id: 46227 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [199.232.76.165]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id C38E0B6F08 for ; Thu, 25 Feb 2010 20:10:20 +1100 (EST) Received: from localhost ([127.0.0.1]:35651 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1NkZfo-0008Mg-5t for incoming@patchwork.ozlabs.org; Thu, 25 Feb 2010 04:06:20 -0500 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1NkZPg-0000AQ-Sw for qemu-devel@nongnu.org; Thu, 25 Feb 2010 03:49:40 -0500 Received: from [199.232.76.173] (port=34084 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1NkZPf-00009h-MF for qemu-devel@nongnu.org; Thu, 25 Feb 2010 03:49:39 -0500 Received: from Debian-exim by monty-python.gnu.org with spam-scanned (Exim 4.60) (envelope-from ) id 1NkZPd-00084Y-S3 for qemu-devel@nongnu.org; Thu, 25 Feb 2010 03:49:39 -0500 Received: from fmmailgate03.web.de ([217.72.192.234]:33232) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1NkZPd-00084Q-BA for qemu-devel@nongnu.org; Thu, 25 Feb 2010 03:49:37 -0500 Received: from smtp08.web.de (fmsmtp08.dlan.cinetic.de [172.20.5.216]) by fmmailgate03.web.de (Postfix) with ESMTP id 0432E14000088; Thu, 25 Feb 2010 09:48:54 +0100 (CET) Received: from [88.65.45.119] (helo=[192.168.1.10]) by smtp08.web.de with asmtp (TLSv1:AES256-SHA:256) (WEB.DE 4.110 #314) id 1NkZOv-000673-00; Thu, 25 Feb 2010 09:48:53 +0100 Message-ID: <4B86396F.8070208@web.de> Date: Thu, 25 Feb 2010 09:48:47 +0100 From: Jan Kiszka User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666 MIME-Version: 1.0 To: Marcelo Tosatti References: <20100224231708.GB16246@amt.cnet> <4B85BA33.5080008@web.de> <20100224234935.GA17862@amt.cnet> <4B85BD22.6050209@web.de> <20100225035814.GA470@amt.cnet> In-Reply-To: <20100225035814.GA470@amt.cnet> X-Enigmail-Version: 0.95.7 X-Sender: jan.kiszka@web.de X-Provags-ID: V01U2FsdGVkX1+0wFgiL66GA5MYitdtF50V7xSHUIGEU16L+DYA A2/7QBNSwhlAS/Re8rri6Rtoip1j/TakUMd7jhXFJkZYhE53hQ 8eG/WDYMk= X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.4-2.6 Cc: Gleb Natapov , Zachary Amsden , Avi Kivity , kvm@vger.kernel.org, qemu-devel@nongnu.org Subject: [Qemu-devel] Re: [PATCH v3 07/10] qemu-kvm: Cleanup/fix TSC and PV clock writeback X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Marcelo Tosatti wrote: > On Thu, Feb 25, 2010 at 12:58:26AM +0100, Jan Kiszka wrote: >> Marcelo Tosatti wrote: >>> On Thu, Feb 25, 2010 at 12:45:55AM +0100, Jan Kiszka wrote: >>>> Marcelo Tosatti wrote: >>>>> On Wed, Feb 24, 2010 at 03:17:55PM +0100, Jan Kiszka wrote: >>>>>> Drop kvm_load_tsc in favor of level-dependent writeback in >>>>>> kvm_arch_load_regs. KVM's PV clock MSRs fall in the same category and >>>>>> should therefore only be written back on full sync. >>>>>> >>>>>> Signed-off-by: Jan Kiszka >>>>>> --- >>>>>> qemu-kvm-x86.c | 19 +++++-------------- >>>>>> qemu-kvm.h | 4 ---- >>>>>> target-i386/machine.c | 5 ----- >>>>>> 3 files changed, 5 insertions(+), 23 deletions(-) >>>>>> >>>>>> diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c >>>>>> index 840c1c9..84fd7fa 100644 >>>>>> --- a/qemu-kvm-x86.c >>>>>> +++ b/qemu-kvm-x86.c >>>>>> @@ -965,8 +965,11 @@ void kvm_arch_load_regs(CPUState *env, int level) >>>>>> set_msr_entry(&msrs[n++], MSR_LSTAR , env->lstar); >>>>>> } >>>>>> #endif >>>>>> - set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr); >>>>>> - set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr); >>>>>> + if (level == KVM_PUT_FULL_STATE) { >>>>>> + set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc); >>>>>> + set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr); >>>>>> + set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr); >>>>>> + } >>>>> As things stand today, the TSC should only be written on migration. See >>>>> 53f658b3c33616a4997ee254311b335e59063289 in the kernel. >>>> Migration and power-up - that's what this patch ensures (=> >>>> KVM_PUT_FULL_STATE). Or where do you see any problem? >>>> >>>> Jan >>>> >>> The problem is it should not write on power up (the kernel attempts >>> to synchronize the TSCs in that case, see the commit). >>> >> OK, need to read this more carefully. >> >> I do not yet understand the difference from user space POV: it tries to >> transfer the identical TSC values to all currently stopped VCPU threads. > > guest tsc = host tsc + offset > > So at the time you set_msr(TSC), the guest visible TSC starts ticking. > For SMP guests, this does not happen exactly at the same time for all > vcpus. Ouch. > >> That should not be different if we are booting a fresh VM or loading a >> complete state of a migrated image. If it does, it looks like a KVM >> kernel deficit on first glance. > > Yes it is a deficit. After migration TSCs of SMP guests go out of sync. > Zachary is working on that. > OK, so we need a workaround, ideally without reintroducing hooks. Is this one acceptable? Jan diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c index 84fd7fa..285c05a 100644 --- a/qemu-kvm-x86.c +++ b/qemu-kvm-x86.c @@ -966,7 +966,15 @@ void kvm_arch_load_regs(CPUState *env, int level) } #endif if (level == KVM_PUT_FULL_STATE) { - set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc); + /* + * KVM is yet unable to synchronize TSC values of multiple VCPUs on + * writeback. Until this is fixed, we only write the offset to SMP + * guests after migration, desynchronizing the VCPUs, but avoiding + * huge jump-backs that would occur without any writeback at all. + */ + if (smp_cpus == 1 || env->tsc != 0) { + set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc); + } set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr); set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr); }