From patchwork Thu Aug 15 04:52:53 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 267308 Return-Path: X-Original-To: incoming-imx@patchwork.ozlabs.org Delivered-To: patchwork-incoming-imx@bilbo.ozlabs.org Received: from casper.infradead.org (unknown [IPv6:2001:770:15f::2]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 42FD42C0209 for ; Thu, 15 Aug 2013 14:54:05 +1000 (EST) Received: from merlin.infradead.org ([2001:4978:20e::2]) by casper.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1V9pZH-000702-E4; Thu, 15 Aug 2013 04:53:51 +0000 Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1V9pZF-000656-1m; Thu, 15 Aug 2013 04:53:49 +0000 Received: from inca-roads.misterjones.org ([213.251.177.50]) by merlin.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1V9pZB-00064I-Ru for linux-arm-kernel@lists.infradead.org; Thu, 15 Aug 2013 04:53:46 +0000 Received: from www-data by cheepnis.misterjones.org with local (Exim 4.80) (envelope-from ) id 1V9pYM-0001Ym-AV; Thu, 15 Aug 2013 06:52:54 +0200 To: Anup Patel Subject: Re: [PATCH] ARM64: KVM: Fix =?UTF-8?Q?coherent=5Ficache=5Fguest?= =?UTF-8?Q?=5Fpage=28=29=20for=20host=20with=20external=20L=33-cache=2E?= X-PHP-Originating-Script: 0:main.inc MIME-Version: 1.0 Date: Thu, 15 Aug 2013 05:52:53 +0100 From: Marc Zyngier Organization: ARM Ltd In-Reply-To: References: <1376480832-18705-1-git-send-email-pranavkumar@linaro.org> Message-ID: X-Sender: marc.zyngier@arm.com User-Agent: Roundcube Webmail/0.7.2 X-SA-Exim-Connect-IP: X-SA-Exim-Rcpt-To: anup@brainfault.org, linaro-kernel@lists.linaro.org, patches@linaro.org, catalin.marinas@arm.com, will.deacon@arm.com, kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org X-SA-Exim-Mail-From: marc.zyngier@arm.com X-SA-Exim-Scanned: No (on cheepnis.misterjones.org); SAEximRunCond expanded to false X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20130815_005345_987469_E0A4B2CF X-CRM114-Status: GOOD ( 33.78 ) X-Spam-Score: -1.2 (-) X-Spam-Report: SpamAssassin version 3.3.2 on merlin.infradead.org summary: Content analysis details: (-1.2 points) pts rule name description ---- ---------------------- -------------------------------------------------- 0.7 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] Cc: linaro-kernel@lists.linaro.org, patches@linaro.org, Catalin Marinas , Will Deacon , kvmarm@lists.cs.columbia.edu, linux-arm-kernel X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.15 Precedence: list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+incoming-imx=patchwork.ozlabs.org@lists.infradead.org List-Id: linux-imx-kernel.lists.patchwork.ozlabs.org Hi Anup, On 2013-08-14 15:22, Anup Patel wrote: > On Wed, Aug 14, 2013 at 5:34 PM, Marc Zyngier > wrote: >> Hi Pranav, >> >> On 2013-08-14 12:47, Pranavkumar Sawargaonkar wrote: >>> Systems with large external L3-cache (few MBs), might have dirty >>> content belonging to the guest page in L3-cache. To tackle this, >>> we need to flush such dirty content from d-cache so that guest >>> will see correct contents of guest page when guest MMU is disabled. >>> >>> The patch fixes coherent_icache_guest_page() for external L3-cache. >>> >>> Signed-off-by: Pranavkumar Sawargaonkar >>> Signed-off-by: Anup Patel >>> --- >>> arch/arm64/include/asm/kvm_mmu.h | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/arch/arm64/include/asm/kvm_mmu.h >>> b/arch/arm64/include/asm/kvm_mmu.h >>> index efe609c..5129038 100644 >>> --- a/arch/arm64/include/asm/kvm_mmu.h >>> +++ b/arch/arm64/include/asm/kvm_mmu.h >>> @@ -123,6 +123,8 @@ static inline void >>> coherent_icache_guest_page(struct kvm *kvm, gfn_t gfn) >>> if (!icache_is_aliasing()) { /* PIPT */ >>> unsigned long hva = gfn_to_hva(kvm, gfn); >>> flush_icache_range(hva, hva + PAGE_SIZE); >>> + /* Flush d-cache for systems with external caches. */ >>> + __flush_dcache_area((void *) hva, PAGE_SIZE); >>> } else if (!icache_is_aivivt()) { /* non ASID-tagged >>> VIVT */ >>> /* any kind of VIPT cache */ >>> __flush_icache_all(); >> >> [adding Will to the discussion as we talked about this in the past] >> >> That's definitely an issue, but I'm not sure the fix is to hit the >> data >> cache on each page mapping. It looks overkill. >> >> Wouldn't it be enough to let userspace do the cache cleaning? >> kvmtools >> knows which bits of the guest memory have been touched, and can do a >> "DC >> DVAC" on this region. > > It seems a bit unnatural to have cache cleaning is user-space. I am > sure > other architectures don't have such cache cleaning in user-space for > KVM. > >> >> The alternative is do it in the kernel before running any vcpu - but >> that's not very nice either (have to clean the whole of the guest >> memory, which makes a full dcache clean more appealing). > > Actually, cleaning full d-cache by set/way upon first run of VCPU was > our second option but current approach seemed very simple hence > we went for this. > > If more people vote for full d-cache clean upon first run of VCPU > then > we should revise this patch. Can you please give the attached patch a spin on your HW? I've boot-tested it on a model, but of course I can't really verify that it fixes your issue. As far as I can see, it should fix it without any additional flushing. Please let me know how it goes. Thanks, M. diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index a5f28e2..a71a9bf 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -60,10 +60,12 @@ /* * The bits we set in HCR: * RW: 64bit by default, can be overriden for 32bit VMs + * TVM: Trap writes to VM registers (until MMU is on) * TAC: Trap ACTLR * TSC: Trap SMC * TSW: Trap cache operations by set/way * TWI: Trap WFI + * DC: Default Cacheable (until MMU is on) * TIDCP: Trap L2CTLR/L2ECTLR * BSU_IS: Upgrade barriers to the inner shareable domain * FB: Force broadcast of all maintainance operations @@ -74,7 +76,7 @@ */ #define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWI | HCR_VM | HCR_BSU_IS | \ HCR_FB | HCR_TAC | HCR_AMO | HCR_IMO | HCR_FMO | \ - HCR_SWIO | HCR_TIDCP | HCR_RW) + HCR_DC | HCR_TVM | HCR_SWIO | HCR_TIDCP | HCR_RW) #define HCR_VIRT_EXCP_MASK (HCR_VA | HCR_VI | HCR_VF) /* Hyp System Control Register (SCTLR_EL2) bits */ diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 9492360..6d81d87 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -121,6 +121,42 @@ done: } /* + * Generic accessor for VM registers. Only called as long as HCR_TVM + * is set. + */ +static bool access_vm_reg(struct kvm_vcpu *vcpu, + const struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + BUG_ON(!p->is_write); + + vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt); + return true; +} + +/* + * SCTLR_EL1 accessor. Only called as long as HCR_TVM is set. If the + * guest enables the MMU, we stop trapping the VM sys_regs and leave + * it in complete control of the caches. + */ +static bool access_sctlr_el1(struct kvm_vcpu *vcpu, + const struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + unsigned long val; + + BUG_ON(!p->is_write); + + val = *vcpu_reg(vcpu, p->Rt); + vcpu_sys_reg(vcpu, r->reg) = val; + + if (val & (1 << 0)) /* MMU Enabled? */ + vcpu->arch.hcr_el2 &= ~(HCR_DC | HCR_TVM); + + return true; +} + +/* * We could trap ID_DFR0 and tell the guest we don't support performance * monitoring. Unfortunately the patch to make the kernel check ID_DFR0 was * NAKed, so it will read the PMCR anyway. @@ -185,32 +221,32 @@ static const struct sys_reg_desc sys_reg_descs[] = { NULL, reset_mpidr, MPIDR_EL1 }, /* SCTLR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b000), - NULL, reset_val, SCTLR_EL1, 0x00C50078 }, + access_sctlr_el1, reset_val, SCTLR_EL1, 0x00C50078 }, /* CPACR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b010), NULL, reset_val, CPACR_EL1, 0 }, /* TTBR0_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b0010), CRm(0b0000), Op2(0b000), - NULL, reset_unknown, TTBR0_EL1 }, + access_vm_reg, reset_unknown, TTBR0_EL1 }, /* TTBR1_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b0010), CRm(0b0000), Op2(0b001), - NULL, reset_unknown, TTBR1_EL1 }, + access_vm_reg, reset_unknown, TTBR1_EL1 }, /* TCR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b0010), CRm(0b0000), Op2(0b010), - NULL, reset_val, TCR_EL1, 0 }, + access_vm_reg, reset_val, TCR_EL1, 0 }, /* AFSR0_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b0101), CRm(0b0001), Op2(0b000), - NULL, reset_unknown, AFSR0_EL1 }, + access_vm_reg, reset_unknown, AFSR0_EL1 }, /* AFSR1_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b0101), CRm(0b0001), Op2(0b001), - NULL, reset_unknown, AFSR1_EL1 }, + access_vm_reg, reset_unknown, AFSR1_EL1 }, /* ESR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b0101), CRm(0b0010), Op2(0b000), - NULL, reset_unknown, ESR_EL1 }, + access_vm_reg, reset_unknown, ESR_EL1 }, /* FAR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b0110), CRm(0b0000), Op2(0b000), - NULL, reset_unknown, FAR_EL1 }, + access_vm_reg, reset_unknown, FAR_EL1 }, /* PMINTENSET_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b1001), CRm(0b1110), Op2(0b001), @@ -221,17 +257,17 @@ static const struct sys_reg_desc sys_reg_descs[] = { /* MAIR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b1010), CRm(0b0010), Op2(0b000), - NULL, reset_unknown, MAIR_EL1 }, + access_vm_reg, reset_unknown, MAIR_EL1 }, /* AMAIR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b1010), CRm(0b0011), Op2(0b000), - NULL, reset_amair_el1, AMAIR_EL1 }, + access_vm_reg, reset_amair_el1, AMAIR_EL1 }, /* VBAR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b0000), Op2(0b000), NULL, reset_val, VBAR_EL1, 0 }, /* CONTEXTIDR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b1101), CRm(0b0000), Op2(0b001), - NULL, reset_val, CONTEXTIDR_EL1, 0 }, + access_vm_reg, reset_val, CONTEXTIDR_EL1, 0 }, /* TPIDR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b1101), CRm(0b0000), Op2(0b100), NULL, reset_unknown, TPIDR_EL1 },