Patchwork KVM: PPC: e500mc: Relax tlb invalidation condition on vcpu schedule

Submitter Mihai Caraman
Date June 12, 2014, 2 p.m.
Message ID <1402581610-16585-1-git-send-email-mihai.caraman@freescale.com>
Permalink /patch/359150/
State New

Comments

Mihai Caraman - June 12, 2014, 2 p.m.
On vcpu schedule, the condition checked for tlb pollution is too tight.
The tlb entries of one vcpu are polluted when a different vcpu from the
same partition runs in-between. Relax the current tlb invalidation
condition taking into account the lpid.

Signed-off-by: Mihai Caraman <mihai.caraman <at> freescale.com>
Cc: Scott Wood <scottwood <at> freescale.com>
---
 arch/powerpc/kvm/e500mc.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)
Alexander Graf - June 12, 2014, 5:04 p.m.
On 06/12/2014 04:00 PM, Mihai Caraman wrote:
> On vcpu schedule, the condition checked for tlb pollution is too tight.
> The tlb entries of one vcpu are polluted when a different vcpu from the
> same partition runs in-between. Relax the current tlb invalidation
> condition taking into account the lpid.
>
> Signed-off-by: Mihai Caraman <mihai.caraman <at> freescale.com>

Your mailer is broken? :)
This really should be an @.

I think this should work. Scott, please ack.


Alex

> Cc: Scott Wood <scottwood <at> freescale.com>
> ---
>   arch/powerpc/kvm/e500mc.c | 20 +++++++++++++++++---
>   1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
> index 17e4562..2e0cd69 100644
> --- a/arch/powerpc/kvm/e500mc.c
> +++ b/arch/powerpc/kvm/e500mc.c
> @@ -111,10 +111,12 @@ void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr)
>   }
>   
>   static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
> +static DEFINE_PER_CPU(int, last_lpid_on_cpu);
>   
>   static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
>   {
>   	struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
> +	bool update_last = false, inval_tlb = false;
>   
>   	kvmppc_booke_vcpu_load(vcpu, cpu);
>   
> @@ -140,12 +142,24 @@ static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
>   	mtspr(SPRN_GDEAR, vcpu->arch.shared->dar);
>   	mtspr(SPRN_GESR, vcpu->arch.shared->esr);
>   
> -	if (vcpu->arch.oldpir != mfspr(SPRN_PIR) ||
> -	    __get_cpu_var(last_vcpu_on_cpu) != vcpu) {
> -		kvmppc_e500_tlbil_all(vcpu_e500);
> +	if (vcpu->arch.oldpir != mfspr(SPRN_PIR)) {
> +		/* tlb entries deprecated */
> +		inval_tlb = update_last = true;
> +	} else if (__get_cpu_var(last_vcpu_on_cpu) != vcpu) {
> +		update_last = true;
> +		/* tlb entries polluted */
> +		inval_tlb = __get_cpu_var(last_lpid_on_cpu) ==
> +			    vcpu->kvm->arch.lpid;
> +	}
> +
> +	if (update_last) {
>   		__get_cpu_var(last_vcpu_on_cpu) = vcpu;
> +		__get_cpu_var(last_lpid_on_cpu) = vcpu->kvm->arch.lpid;
>   	}
>   
> +	if (inval_tlb)
> +		kvmppc_e500_tlbil_all(vcpu_e500);
> +
>   	kvmppc_load_guest_fp(vcpu);
>   }
>   

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mihai Caraman - June 13, 2014, 2:43 p.m.
> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Thursday, June 12, 2014 8:05 PM
> To: Caraman Mihai Claudiu-B02008
> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
> dev@lists.ozlabs.org; Wood Scott-B07421
> Subject: Re: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition
> on vcpu schedule
> 
> On 06/12/2014 04:00 PM, Mihai Caraman wrote:
> > On vcpu schedule, the condition checked for tlb pollution is too tight.
> > The tlb entries of one vcpu are polluted when a different vcpu from the
> > same partition runs in-between. Relax the current tlb invalidation
> > condition taking into account the lpid.
> >
> > Signed-off-by: Mihai Caraman <mihai.caraman <at> freescale.com>
> 
> Your mailer is broken? :)
> This really should be an @.
> 
> I think this should work. Scott, please ack.

Alex, you were right. I screwed up the patch description by inverting relax
and tight terms :) It should have been more like this:

KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

On vcpu schedule, the condition checked for tlb pollution is too loose.
The tlb entries of a vcpu are polluted (vs stale) only when a different vcpu
within the same logical partition runs in-between. Optimize the tlb invalidation
condition taking into account the lpid.

-Mike
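
For readers comparing the two conditions, here is a minimal sketch of the check before and after the patch, written as pure predicates. It is an illustration only: struct load_ctx and both function names are hypothetical, vcpus are modelled as integer ids rather than pointers, and the per-CPU bookkeeping done by the real code is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the state examined at vcpu load time. */
struct load_ctx {
	bool pir_changed; /* models vcpu->arch.oldpir != mfspr(SPRN_PIR) */
	int last_vcpu;    /* models last_vcpu_on_cpu (an id here, a pointer in the kernel) */
	int last_lpid;    /* models last_lpid_on_cpu (added by the patch) */
	int vcpu;         /* vcpu being loaded */
	int lpid;         /* models vcpu->kvm->arch.lpid */
};

/* Condition before the patch: any change of vcpu on this cpu invalidates. */
static bool inval_before_patch(const struct load_ctx *c)
{
	assert(c);
	return c->pir_changed || c->last_vcpu != c->vcpu;
}

/*
 * Condition after the patch: a change of vcpu invalidates only when the
 * previous vcpu on this cpu belonged to the same logical partition.
 */
static bool inval_after_patch(const struct load_ctx *c)
{
	assert(c);
	if (c->pir_changed)
		return true;
	return c->last_vcpu != c->vcpu && c->last_lpid == c->lpid;
}
```

The two predicates differ only when the previously loaded vcpu belongs to another partition; in that case the patched condition skips the invalidation.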
Alexander Graf - June 13, 2014, 2:55 p.m.
On 13.06.14 16:43, mihai.caraman@freescale.com wrote:
>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@suse.de]
>> Sent: Thursday, June 12, 2014 8:05 PM
>> To: Caraman Mihai Claudiu-B02008
>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
>> dev@lists.ozlabs.org; Wood Scott-B07421
>> Subject: Re: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition
>> on vcpu schedule
>>
>> On 06/12/2014 04:00 PM, Mihai Caraman wrote:
>>> On vcpu schedule, the condition checked for tlb pollution is too tight.
>>> The tlb entries of one vcpu are polluted when a different vcpu from the
>>> same partition runs in-between. Relax the current tlb invalidation
>>> condition taking into account the lpid.
>>>
>>> Signed-off-by: Mihai Caraman <mihai.caraman <at> freescale.com>
>> Your mailer is broken? :)
>> This really should be an @.
>>
>> I think this should work. Scott, please ack.
> Alex, you were right. I screwed up the patch description by inverting relax
> and tight terms :) It should have been more like this:
>
> KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
>
> On vcpu schedule, the condition checked for tlb pollution is too loose.
> The tlb entries of a vcpu are polluted (vs stale) only when a different vcpu
> within the same logical partition runs in-between. Optimize the tlb invalidation
> condition taking into account the lpid.

Can't we give every vcpu its own lpid? Or don't we trap on global 
invalidates?


Alex

Scott Wood - June 13, 2014, 7:42 p.m.
On Fri, 2014-06-13 at 16:55 +0200, Alexander Graf wrote:
> On 13.06.14 16:43, mihai.caraman@freescale.com wrote:
> >> -----Original Message-----
> >> From: Alexander Graf [mailto:agraf@suse.de]
> >> Sent: Thursday, June 12, 2014 8:05 PM
> >> To: Caraman Mihai Claudiu-B02008
> >> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
> >> dev@lists.ozlabs.org; Wood Scott-B07421
> >> Subject: Re: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition
> >> on vcpu schedule
> >>
> >> On 06/12/2014 04:00 PM, Mihai Caraman wrote:
> >>> On vcpu schedule, the condition checked for tlb pollution is too tight.
> >>> The tlb entries of one vcpu are polluted when a different vcpu from the
> >>> same partition runs in-between. Relax the current tlb invalidation
> >>> condition taking into account the lpid.

Can you quantify the performance improvement from this?  We've had bugs
in this area before, so let's make sure it's worth it before making this
more complicated.

> >>> Signed-off-by: Mihai Caraman <mihai.caraman <at> freescale.com>
> >> Your mailer is broken? :)
> >> This really should be an @.
> >>
> >> I think this should work. Scott, please ack.
> > Alex, you were right. I screwed up the patch description by inverting relax
> > and tight terms :) It should have been more like this:
> >
> > KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
> >
> > On vcpu schedule, the condition checked for tlb pollution is too loose.
> > The tlb entries of a vcpu are polluted (vs stale) only when a different vcpu
> > within the same logical partition runs in-between. Optimize the tlb invalidation
> > condition taking into account the lpid.
> 
> Can't we give every vcpu its own lpid? Or don't we trap on global 
> invalidates?

That would significantly increase the odds of exhausting LPIDs,
especially on large chips like t4240 with similarly large VMs.  If we
were to do that, the LPIDs would need to be dynamically assigned (like
PIDs), and should probably be a separate numberspace per physical core.

-Scott


Alexander Graf - June 17, 2014, 9:08 a.m.
On 13.06.14 21:42, Scott Wood wrote:
> On Fri, 2014-06-13 at 16:55 +0200, Alexander Graf wrote:
>> On 13.06.14 16:43, mihai.caraman@freescale.com wrote:
>>>> -----Original Message-----
>>>> From: Alexander Graf [mailto:agraf@suse.de]
>>>> Sent: Thursday, June 12, 2014 8:05 PM
>>>> To: Caraman Mihai Claudiu-B02008
>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
>>>> dev@lists.ozlabs.org; Wood Scott-B07421
>>>> Subject: Re: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition
>>>> on vcpu schedule
>>>>
>>>> On 06/12/2014 04:00 PM, Mihai Caraman wrote:
>>>>> On vcpu schedule, the condition checked for tlb pollution is too tight.
>>>>> The tlb entries of one vcpu are polluted when a different vcpu from the
>>>>> same partition runs in-between. Relax the current tlb invalidation
>>>>> condition taking into account the lpid.
> Can you quantify the performance improvement from this?  We've had bugs
> in this area before, so let's make sure it's worth it before making this
> more complicated.
>
>>>>> Signed-off-by: Mihai Caraman <mihai.caraman <at> freescale.com>
>>>> Your mailer is broken? :)
>>>> This really should be an @.
>>>>
>>>> I think this should work. Scott, please ack.
>>> Alex, you were right. I screwed up the patch description by inverting relax
>>> and tight terms :) It should have been more like this:
>>>
>>> KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
>>>
>>> On vcpu schedule, the condition checked for tlb pollution is too loose.
>>> The tlb entries of a vcpu are polluted (vs stale) only when a different vcpu
>>> within the same logical partition runs in-between. Optimize the tlb invalidation
>>> condition taking into account the lpid.
>> Can't we give every vcpu its own lpid? Or don't we trap on global
>> invalidates?
> That would significantly increase the odds of exhausting LPIDs,
> especially on large chips like t4240 with similarly large VMs.  If we
> were to do that, the LPIDs would need to be dynamically assigned (like
> PIDs), and should probably be a separate numberspace per physical core.

True, I didn't realize we only have so few of them. It would however 
save us from most flushing as long as we have spare LPIDs available :).


Alex

Mihai Caraman - June 17, 2014, noon
> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Tuesday, June 17, 2014 12:09 PM
> To: Wood Scott-B07421
> Cc: Caraman Mihai Claudiu-B02008; kvm-ppc@vger.kernel.org;
> kvm@vger.kernel.org; linuxppc-dev@lists.ozlabs.org
> Subject: Re: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition
> on vcpu schedule
>
> On 13.06.14 21:42, Scott Wood wrote:
> > On Fri, 2014-06-13 at 16:55 +0200, Alexander Graf wrote:
> >> On 13.06.14 16:43, mihai.caraman@freescale.com wrote:
> >>>> -----Original Message-----
> >>>> From: Alexander Graf [mailto:agraf@suse.de]
> >>>> Sent: Thursday, June 12, 2014 8:05 PM
> >>>> To: Caraman Mihai Claudiu-B02008
> >>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
> >>>> dev@lists.ozlabs.org; Wood Scott-B07421
> >>>> Subject: Re: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition
> >>>> on vcpu schedule
> >>>>
> >>>> On 06/12/2014 04:00 PM, Mihai Caraman wrote:
> >>>>> On vcpu schedule, the condition checked for tlb pollution is too tight.
> >>>>> The tlb entries of one vcpu are polluted when a different vcpu from the
> >>>>> same partition runs in-between. Relax the current tlb invalidation
> >>>>> condition taking into account the lpid.
> > Can you quantify the performance improvement from this?  We've had bugs
> > in this area before, so let's make sure it's worth it before making this
> > more complicated.
> >
> >>>>> Signed-off-by: Mihai Caraman <mihai.caraman <at> freescale.com>
> >>>> Your mailer is broken? :)
> >>>> This really should be an @.
> >>>>
> >>>> I think this should work. Scott, please ack.
> >>> Alex, you were right. I screwed up the patch description by inverting relax
> >>> and tight terms :) It should have been more like this:
> >>>
> >>> KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
> >>>
> >>> On vcpu schedule, the condition checked for tlb pollution is too loose.
> >>> The tlb entries of a vcpu are polluted (vs stale) only when a different vcpu
> >>> within the same logical partition runs in-between. Optimize the tlb invalidation
> >>> condition taking into account the lpid.
> >> Can't we give every vcpu its own lpid? Or don't we trap on global
> >> invalidates?
> > That would significantly increase the odds of exhausting LPIDs,
> > especially on large chips like t4240 with similarly large VMs.  If we
> > were to do that, the LPIDs would need to be dynamically assigned (like
> > PIDs), and should probably be a separate numberspace per physical core.
>
> True, I didn't realize we only have so few of them. It would however
> save us from most flushing as long as we have spare LPIDs available :).

Yes, we had this proposal on the table for the e6500 multithreaded core. This
core lacks a tlb write conditional instruction, so an OS needs to use locks
to protect itself against concurrent tlb writes executed from sibling threads.
When we expose hw threads as single-threaded vcpus (useful when the user opts
not to pin vcpus), the guest can no longer protect itself optimally
(it can protect tlb writes across all threads but this is not acceptable).
So instead, we found a solution at hypervisor level by assigning different
logical partition ids to a guest's vcpus running simultaneously on sibling hw
threads. Currently in the FSL SDK we allocate two lpids to each guest.

I am also a proponent of using the whole LPID space (63 values) per (multi-threaded)
physical core, which will lead to fewer invalidates on vcpu schedule and will
accommodate the solution described above.

-Mike
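
The per-core dynamic LPID assignment discussed above could look roughly like the sketch below. Nothing here is taken from the kernel or the FSL SDK: struct core_lpids, both functions and MAX_VCPUS are hypothetical, LPID 0 is assumed to be reserved for the host (which is one way to arrive at the 63 values mentioned), and the recycling policy on exhaustion (drop every assignment and ask the caller to flush the whole TLB, similar to a PID rollover) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_GUEST_LPIDS 63 /* figure quoted in the thread; lpid 0 assumed host-only */
#define MAX_VCPUS 128     /* arbitrary bound for this sketch */

/* Hypothetical per-physical-core LPID numberspace. */
struct core_lpids {
	int lpid_of[MAX_VCPUS]; /* lpid assigned to each vcpu on this core, 0 = none */
	int next_free;          /* next unused lpid, in 1..NR_GUEST_LPIDS + 1 */
};

static void core_lpids_init(struct core_lpids *c)
{
	for (int i = 0; i < MAX_VCPUS; i++)
		c->lpid_of[i] = 0;
	c->next_free = 1;
}

/*
 * Return the lpid to use for vcpu on this core. *flush_all is set when the
 * numberspace was exhausted and recycled, in which case the caller would
 * have to invalidate all guest tlb entries on the core.
 */
static int core_lpid_get(struct core_lpids *c, int vcpu, bool *flush_all)
{
	assert(vcpu >= 0 && vcpu < MAX_VCPUS);
	*flush_all = false;

	if (c->lpid_of[vcpu])
		return c->lpid_of[vcpu]; /* still valid, no flush needed */

	if (c->next_free > NR_GUEST_LPIDS) {
		core_lpids_init(c); /* forget all assignments */
		*flush_all = true;
	}

	c->lpid_of[vcpu] = c->next_free++;
	return c->lpid_of[vcpu];
}
```

With such a scheme a vcpu returning to a core keeps its entries as long as its lpid has not been recycled, which is the "fewer invalidates" effect described above; the cost is a full flush whenever more than 63 distinct vcpus have passed through the core.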
Scott Wood - June 17, 2014, 3:33 p.m.
On Thu, 2014-06-12 at 19:04 +0200, Alexander Graf wrote:
> On 06/12/2014 04:00 PM, Mihai Caraman wrote:
> > @@ -140,12 +142,24 @@ static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
> >   	mtspr(SPRN_GDEAR, vcpu->arch.shared->dar);
> >   	mtspr(SPRN_GESR, vcpu->arch.shared->esr);
> >   
> > -	if (vcpu->arch.oldpir != mfspr(SPRN_PIR) ||
> > -	    __get_cpu_var(last_vcpu_on_cpu) != vcpu) {
> > -		kvmppc_e500_tlbil_all(vcpu_e500);
> > +	if (vcpu->arch.oldpir != mfspr(SPRN_PIR)) {
> > +		/* tlb entries deprecated */
> > +		inval_tlb = update_last = true;
> > +	} else if (__get_cpu_var(last_vcpu_on_cpu) != vcpu) {
> > +		update_last = true;
> > +		/* tlb entries polluted */
> > +		inval_tlb = __get_cpu_var(last_lpid_on_cpu) ==
> > +			    vcpu->kvm->arch.lpid;
> > +	}

What about the following sequence on one CPU:

LPID 1, vcpu A
LPID 2, vcpu C
LPID 1, vcpu B
LPID 2, vcpu C	doesn't invalidate
LPID 1, vcpu A  doesn't invalidate

In the last line, vcpu A last ran on this cpu (oldpir matches), but LPID
2 last ran on this cpu (last_lpid_on_cpu does not match) -- but an
invalidation has never happened since vcpu B from LPID 1 ran on this
cpu.

-Scott
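
The sequence above can be replayed with a small user-space model of a single cpu. This is a sketch under stated assumptions, not kernel code: struct sim and its functions are hypothetical, a vcpu's first load on the cpu is treated as an oldpir mismatch, and an invalidation is modelled as clearing the ownership of the entries tagged with the vcpu's lpid.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LPID 4
#define MAX_VCPU 8
#define NONE (-1)

/* Hypothetical model of one physical cpu. */
struct sim {
	int last_vcpu;           /* models last_vcpu_on_cpu */
	int last_lpid;           /* models last_lpid_on_cpu */
	int tlb_owner[MAX_LPID]; /* vcpu whose entries currently sit in the tlb, per lpid */
	bool seen[MAX_VCPU];     /* vcpu has already run on this cpu (oldpir matches) */
};

static void sim_init(struct sim *s)
{
	s->last_vcpu = NONE;
	s->last_lpid = NONE;
	for (int i = 0; i < MAX_LPID; i++)
		s->tlb_owner[i] = NONE;
	for (int i = 0; i < MAX_VCPU; i++)
		s->seen[i] = false;
}

/*
 * Load a vcpu using the patched condition. Returns true when the vcpu ends
 * up running on top of entries left behind by another vcpu of its partition.
 */
static bool sim_load(struct sim *s, int vcpu, int lpid)
{
	bool inval = false, update = false, polluted;

	assert(vcpu >= 0 && vcpu < MAX_VCPU);
	assert(lpid >= 0 && lpid < MAX_LPID);

	if (!s->seen[vcpu]) {
		inval = update = true;         /* oldpir mismatch */
	} else if (s->last_vcpu != vcpu) {
		update = true;
		inval = (s->last_lpid == lpid);
	}

	if (update) {
		s->last_vcpu = vcpu;
		s->last_lpid = lpid;
	}
	if (inval)
		s->tlb_owner[lpid] = NONE;     /* models the tlb invalidation for this lpid */

	polluted = s->tlb_owner[lpid] != NONE && s->tlb_owner[lpid] != vcpu;
	s->tlb_owner[lpid] = vcpu;
	s->seen[vcpu] = true;
	return polluted;
}

/* Replay the sequence; return the 1-based step of the first polluted run, 0 if none. */
static int replay_scott_sequence(void)
{
	enum { A, B, C };
	static const int seq[][2] = { {A, 1}, {C, 2}, {B, 1}, {C, 2}, {A, 1} };
	struct sim s;

	sim_init(&s);
	for (int i = 0; i < 5; i++)
		if (sim_load(&s, seq[i][0], seq[i][1]))
			return i + 1;
	return 0;
}
```

In this model the fifth step is the first one where a vcpu runs on another vcpu's entries: vcpu A is loaded without an invalidation while the entries for LPID 1 still belong to vcpu B. Tracking only the last lpid on the cpu is not enough; the model suggests the state would have to be kept per lpid.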



Patch

diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 17e4562..2e0cd69 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -111,10 +111,12 @@  void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr)
 }
 
 static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
+static DEFINE_PER_CPU(int, last_lpid_on_cpu);
 
 static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
 {
 	struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
+	bool update_last = false, inval_tlb = false;
 
 	kvmppc_booke_vcpu_load(vcpu, cpu);
 
@@ -140,12 +142,24 @@  static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
 	mtspr(SPRN_GDEAR, vcpu->arch.shared->dar);
 	mtspr(SPRN_GESR, vcpu->arch.shared->esr);
 
-	if (vcpu->arch.oldpir != mfspr(SPRN_PIR) ||
-	    __get_cpu_var(last_vcpu_on_cpu) != vcpu) {
-		kvmppc_e500_tlbil_all(vcpu_e500);
+	if (vcpu->arch.oldpir != mfspr(SPRN_PIR)) {
+		/* tlb entries deprecated */
+		inval_tlb = update_last = true;
+	} else if (__get_cpu_var(last_vcpu_on_cpu) != vcpu) {
+		update_last = true;
+		/* tlb entries polluted */
+		inval_tlb = __get_cpu_var(last_lpid_on_cpu) ==
+			    vcpu->kvm->arch.lpid;
+	}
+
+	if (update_last) {
 		__get_cpu_var(last_vcpu_on_cpu) = vcpu;
+		__get_cpu_var(last_lpid_on_cpu) = vcpu->kvm->arch.lpid;
 	}
 
+	if (inval_tlb)
+		kvmppc_e500_tlbil_all(vcpu_e500);
+
 	kvmppc_load_guest_fp(vcpu);
 }