
[v2] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

Message ID 1402995739-23756-1-git-send-email-mihai.caraman@freescale.com (mailing list archive)
State Superseded

Commit Message

Mihai Caraman June 17, 2014, 9:02 a.m. UTC
On vcpu schedule, the condition checked for tlb pollution is too loose.
The tlb entries of a vcpu become polluted (vs stale) only when a different
vcpu within the same logical partition runs in-between. Optimize the tlb
invalidation condition by taking the logical partition id into account.
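The stale-vs-polluted distinction can be sketched as a small, hypothetical C model (simplified names and plain ints; not the kernel's actual types or per-cpu accessors):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the per-cpu bookkeeping in
 * kvmppc_core_vcpu_load_e500mc(); vcpu ids and lpids are plain ints. */
struct pcpu_state {
	int last_vcpu;	/* models last_vcpu_on_cpu */
	int last_lpid;	/* models last_lpid_on_cpu */
};

/* Return true when the vcpu's guest TLB entries must be invalidated. */
static bool need_tlb_inval(struct pcpu_state *s, int vcpu, int lpid,
			   int oldpir, int pir)
{
	bool inval = false;

	if (oldpir != pir) {
		/* vcpu last ran on a different physical cpu:
		 * its entries there are stale */
		inval = true;
	} else if (s->last_vcpu != vcpu) {
		/* another vcpu ran on this cpu in-between; the entries
		 * are polluted only if it used the same lpid */
		inval = (s->last_lpid == lpid);
	}

	s->last_vcpu = vcpu;
	s->last_lpid = lpid;
	return inval;
}
```

The win is the middle case: a vcpu rescheduled on the same cpu after a vcpu from a *different* partition ran there needs no flush, which the old `last_vcpu_on_cpu != vcpu` check could not express.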

With the new invalidation condition, a guest shows a 4% performance improvement
on P5020DS while running a memory stress application with the cpu oversubscribed;
the other guest runs a cpu-intensive workload.

Guest - old invalidation condition
  real 3.89
  user 3.87
  sys 0.01

Guest - enhanced invalidation condition
  real 3.75
  user 3.73
  sys 0.01

Host
  real 3.70
  user 1.85
  sys 0.00

The memory stress application accesses 4KB pages backed by 75% of available
TLB0 entries:

char foo[ENTRIES][4096] __attribute__ ((aligned (4096)));

int main()
{
	volatile char bar;	/* keep the loads from being optimized out */
	int i, j;

	for (i = 0; i < ITERATIONS; i++)
		for (j = 0; j < ENTRIES; j++)
			bar = foo[j][0];

	return 0;
}
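For scale, assuming a hypothetical 512-entry TLB0 (the real capacity must be taken from the core's reference manual; it is not stated in this patch), 75% coverage works out as:

```c
/* Hypothetical sizing only: 512 is an assumed TLB0 capacity,
 * not a value taken from the patch. */
#define TLB0_ENTRIES 512
#define ENTRIES (TLB0_ENTRIES * 3 / 4)	/* 75% of TLB0 */

/* total memory touched per iteration, one 4 KiB page per entry */
static int working_set_kib(void)
{
	return ENTRIES * 4;
}
```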

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
---
v2:
 - improve patch name and description
 - add performance results

 arch/powerpc/kvm/e500mc.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

Comments

Scott Wood June 17, 2014, 3:35 p.m. UTC | #1
On Tue, 2014-06-17 at 12:02 +0300, Mihai Caraman wrote:
> On vcpu schedule, the condition checked for tlb pollution is too loose.
> The tlb entries of a vcpu become polluted (vs stale) only when a different
> vcpu within the same logical partition runs in-between. Optimize the tlb
> invalidation condition taking into account the logical partition id.
> 
> With the new invalidation condition, a guest shows 4% performance improvement
> on P5020DS while running a memory stress application with the cpu oversubscribed,
> the other guest running a cpu intensive workload.

See
https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118547.html

-Scott
Mihai Caraman June 17, 2014, 7:04 p.m. UTC | #2
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Tuesday, June 17, 2014 6:36 PM
> To: Caraman Mihai Claudiu-B02008
> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
> dev@lists.ozlabs.org
> Subject: Re: [PATCH v2] KVM: PPC: e500mc: Enhance tlb invalidation
> condition on vcpu schedule
> 
> On Tue, 2014-06-17 at 12:02 +0300, Mihai Caraman wrote:
> > On vcpu schedule, the condition checked for tlb pollution is too loose.
> > The tlb entries of a vcpu become polluted (vs stale) only when a
> different
> > vcpu within the same logical partition runs in-between. Optimize the
> tlb
> > invalidation condition taking into account the logical partition id.
> >
> > With the new invalidation condition, a guest shows 4% performance
> improvement
> > on P5020DS while running a memory stress application with the cpu
> oversubscribed,
> > the other guest running a cpu intensive workload.
> 
> See
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118547.html

Thanks. The original code needs just a simple adjustment to benefit from
this optimization, please review v3.

- Mike
Scott Wood June 17, 2014, 7:05 p.m. UTC | #3
On Tue, 2014-06-17 at 14:04 -0500, Caraman Mihai Claudiu-B02008 wrote:
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Tuesday, June 17, 2014 6:36 PM
> > To: Caraman Mihai Claudiu-B02008
> > Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
> > dev@lists.ozlabs.org
> > Subject: Re: [PATCH v2] KVM: PPC: e500mc: Enhance tlb invalidation
> > condition on vcpu schedule
> > 
> > On Tue, 2014-06-17 at 12:02 +0300, Mihai Caraman wrote:
> > > On vcpu schedule, the condition checked for tlb pollution is too loose.
> > > The tlb entries of a vcpu become polluted (vs stale) only when a
> > different
> > > vcpu within the same logical partition runs in-between. Optimize the
> > tlb
> > > invalidation condition taking into account the logical partition id.
> > >
> > > With the new invalidation condition, a guest shows 4% performance
> > improvement
> > > on P5020DS while running a memory stress application with the cpu
> > oversubscribed,
> > > the other guest running a cpu intensive workload.
> > 
> > See
> > https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118547.html
> 
> Thanks. The original code needs just a simple adjustment to benefit from
> this optimization, please review v3.

Where is v3?  Or is it forthcoming?

-Scott

Patch

diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 17e4562..d3b814b0 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -111,10 +111,12 @@  void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr)
 }
 
 static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
+static DEFINE_PER_CPU(int, last_lpid_on_cpu);
 
 static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
 {
 	struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
+	bool update_last = false, inval_tlb = false;
 
 	kvmppc_booke_vcpu_load(vcpu, cpu);
 
@@ -140,12 +142,24 @@  static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
 	mtspr(SPRN_GDEAR, vcpu->arch.shared->dar);
 	mtspr(SPRN_GESR, vcpu->arch.shared->esr);
 
-	if (vcpu->arch.oldpir != mfspr(SPRN_PIR) ||
-	    __get_cpu_var(last_vcpu_on_cpu) != vcpu) {
-		kvmppc_e500_tlbil_all(vcpu_e500);
+	if (vcpu->arch.oldpir != mfspr(SPRN_PIR)) {
+		/* stale tlb entries */
+		inval_tlb = update_last = true;
+	} else if (__get_cpu_var(last_vcpu_on_cpu) != vcpu) {
+		update_last = true;
+		/* polluted tlb entries */
+		inval_tlb = __get_cpu_var(last_lpid_on_cpu) ==
+			    vcpu->kvm->arch.lpid;
+	}
+
+	if (update_last) {
 		__get_cpu_var(last_vcpu_on_cpu) = vcpu;
+		__get_cpu_var(last_lpid_on_cpu) = vcpu->kvm->arch.lpid;
 	}
 
+	if (inval_tlb)
+		kvmppc_e500_tlbil_all(vcpu_e500);
+
 	kvmppc_load_guest_fp(vcpu);
 }