diff mbox

[5/6,v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation

Message ID 1379570566-3715-6-git-send-email-Bharat.Bhushan@freescale.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

Bharat Bhushan Sept. 19, 2013, 6:02 a.m. UTC
On booke, "struct tlbe_ref" contains host tlb mapping information
(pfn: for guest-pfn to pfn, flags: attribute associated with this mapping)
for a guest tlb entry. So when a guest creates a TLB entry then
"struct tlbe_ref" is set to point to valid "pfn" and set attributes in
"flags" field of the above said structure. When a guest TLB entry is
invalidated then flags field of corresponding "struct tlbe_ref" is
updated to point that this is no more valid, also we selectively clear
some other attribute bits, example: if E500_TLB_BITMAP was set then we clear
E500_TLB_BITMAP, if E500_TLB_TLB0 is set then we clear this.

Ideally we should clear complete "flags" as this entry is invalid and does not
have anything to re-used. The other part of the problem is that when we use
the same entry again then also we do not clear (started doing or-ing etc).

So far it was working because the selectively clearing mentioned above
actually clears "flags" what was set during TLB mapping. But the problem
starts coming when we add more attributes to this then we need to selectively
clear them and which is not needed.

This patch we do both
        - Clear "flags" when invalidating;
        - Clear "flags" when reusing same entry later

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
v3-> v5
 - New patch (found this issue when doing vfio-pci development)

 arch/powerpc/kvm/e500_mmu_host.c |   12 +++++++-----
 1 files changed, 7 insertions(+), 5 deletions(-)

Comments

Scott Wood Sept. 19, 2013, 9:07 p.m. UTC | #1
On Thu, 2013-09-19 at 11:32 +0530, Bharat Bhushan wrote:
> On booke, "struct tlbe_ref" contains host tlb mapping information
> (pfn: for guest-pfn to pfn, flags: attribute associated with this mapping)
> for a guest tlb entry. So when a guest creates a TLB entry then
> "struct tlbe_ref" is set to point to valid "pfn" and set attributes in
> "flags" field of the above said structure. When a guest TLB entry is
> invalidated then flags field of corresponding "struct tlbe_ref" is
> updated to point that this is no more valid, also we selectively clear
> some other attribute bits, example: if E500_TLB_BITMAP was set then we clear
> E500_TLB_BITMAP, if E500_TLB_TLB0 is set then we clear this.
> 
> Ideally we should clear complete "flags" as this entry is invalid and does not
> have anything to re-used. The other part of the problem is that when we use
> the same entry again then also we do not clear (started doing or-ing etc).
> 
> So far it was working because the selectively clearing mentioned above
> actually clears "flags" what was set during TLB mapping. But the problem
> starts coming when we add more attributes to this then we need to selectively
> clear them and which is not needed.
> 
> This patch we do both
>         - Clear "flags" when invalidating;
>         - Clear "flags" when reusing same entry later
> 
> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> ---
> v3-> v5
>  - New patch (found this issue when doing vfio-pci development)
> 
>  arch/powerpc/kvm/e500_mmu_host.c |   12 +++++++-----
>  1 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
> index 1c6a9d7..60f5a3c 100644
> --- a/arch/powerpc/kvm/e500_mmu_host.c
> +++ b/arch/powerpc/kvm/e500_mmu_host.c
> @@ -217,7 +217,8 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel,
>  		}
>  		mb();
>  		vcpu_e500->g2h_tlb1_map[esel] = 0;
> -		ref->flags &= ~(E500_TLB_BITMAP | E500_TLB_VALID);
> +		/* Clear flags as TLB is not backed by the host anymore */
> +		ref->flags = 0;
>  		local_irq_restore(flags);
>  	}

This breaks when you have both E500_TLB_BITMAP and E500_TLB_TLB0 set.

Instead, just convert the final E500_TLB_VALID clearing at the end into
ref->flags = 0, and convert the early return a few lines earlier into
conditional execution of the tlbil_one().

-Scott
Bharat Bhushan Sept. 20, 2013, 4:19 a.m. UTC | #2
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Friday, September 20, 2013 2:38 AM
> To: Bhushan Bharat-R65777
> Cc: benh@kernel.crashing.org; agraf@suse.de; paulus@samba.org;
> kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> Bhushan Bharat-R65777
> Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest
> tlb invalidation
> 
> On Thu, 2013-09-19 at 11:32 +0530, Bharat Bhushan wrote:
> > On booke, "struct tlbe_ref" contains host tlb mapping information
> > (pfn: for guest-pfn to pfn, flags: attribute associated with this
> > mapping) for a guest tlb entry. So when a guest creates a TLB entry
> > then "struct tlbe_ref" is set to point to valid "pfn" and set
> > attributes in "flags" field of the above said structure. When a guest
> > TLB entry is invalidated then flags field of corresponding "struct
> > tlbe_ref" is updated to point that this is no more valid, also we
> > selectively clear some other attribute bits, example: if
> > E500_TLB_BITMAP was set then we clear E500_TLB_BITMAP, if E500_TLB_TLB0 is set
> then we clear this.
> >
> > Ideally we should clear complete "flags" as this entry is invalid and
> > does not have anything to re-used. The other part of the problem is
> > that when we use the same entry again then also we do not clear (started doing
> or-ing etc).
> >
> > So far it was working because the selectively clearing mentioned above
> > actually clears "flags" what was set during TLB mapping. But the
> > problem starts coming when we add more attributes to this then we need
> > to selectively clear them and which is not needed.
> >
> > This patch we do both
> >         - Clear "flags" when invalidating;
> >         - Clear "flags" when reusing same entry later
> >
> > Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> > ---
> > v3-> v5
> >  - New patch (found this issue when doing vfio-pci development)
> >
> >  arch/powerpc/kvm/e500_mmu_host.c |   12 +++++++-----
> >  1 files changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/e500_mmu_host.c
> > b/arch/powerpc/kvm/e500_mmu_host.c
> > index 1c6a9d7..60f5a3c 100644
> > --- a/arch/powerpc/kvm/e500_mmu_host.c
> > +++ b/arch/powerpc/kvm/e500_mmu_host.c
> > @@ -217,7 +217,8 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500
> *vcpu_e500, int tlbsel,
> >  		}
> >  		mb();
> >  		vcpu_e500->g2h_tlb1_map[esel] = 0;
> > -		ref->flags &= ~(E500_TLB_BITMAP | E500_TLB_VALID);
> > +		/* Clear flags as TLB is not backed by the host anymore */
> > +		ref->flags = 0;
> >  		local_irq_restore(flags);
> >  	}
> 
> This breaks when you have both E500_TLB_BITMAP and E500_TLB_TLB0 set.

I do not see any case where we set both E500_TLB_BITMAP and E500_TLB_TLB0. Also we have not optimized that yet (keeping track of multiple shadow TLB0 entries for one guest TLB1 entry)

We uses these bit flags only for TLB1 and if size of stlbe is 4K then we set E500_TLB_TLB0  otherwise we set E500_TLB_BITMAP. Although I think that E500_TLB_BITMAP should be set only if stlbe size is less than gtlbe size.

> 
> Instead, just convert the final E500_TLB_VALID clearing at the end into
> ref->flags = 0, and convert the early return a few lines earlier into
> conditional execution of the tlbil_one().

This looks better, will send the patch shortly.

Thanks
-Bharat

> 
> -Scott
>
Scott Wood Sept. 20, 2013, 4:18 p.m. UTC | #3
On Thu, 2013-09-19 at 23:19 -0500, Bhushan Bharat-R65777 wrote:
> 
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Friday, September 20, 2013 2:38 AM
> > To: Bhushan Bharat-R65777
> > Cc: benh@kernel.crashing.org; agraf@suse.de; paulus@samba.org;
> > kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> > Bhushan Bharat-R65777
> > Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest
> > tlb invalidation
> > 
> > This breaks when you have both E500_TLB_BITMAP and E500_TLB_TLB0 set.
> 
> I do not see any case where we set both E500_TLB_BITMAP and
> E500_TLB_TLB0.

This would happen if you have a guest TLB1 entry that is backed by some
4K pages and some larger pages (e.g. if the guest maps CCSR with one big
TLB1 and there are varying I/O passthrough regions mapped).  It's not
common, but it's possible.

>  Also we have not optimized that yet (keeping track of
> multiple shadow TLB0 entries for one guest TLB1 entry)

This is about correctness, not optimization.

> We uses these bit flags only for TLB1 and if size of stlbe is 4K then
> we set E500_TLB_TLB0  otherwise we set E500_TLB_BITMAP. Although I
> think that E500_TLB_BITMAP should be set only if stlbe size is less
> than gtlbe size.

Why?  Even if there's only one bit set in the map, we need it to keep
track of which entry was used.

-Scott
Bharat Bhushan Sept. 20, 2013, 6:04 p.m. UTC | #4
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Friday, September 20, 2013 9:48 PM
> To: Bhushan Bharat-R65777
> Cc: Wood Scott-B07421; benh@kernel.crashing.org; agraf@suse.de;
> paulus@samba.org; kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-
> dev@lists.ozlabs.org
> Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest
> tlb invalidation
> 
> On Thu, 2013-09-19 at 23:19 -0500, Bhushan Bharat-R65777 wrote:
> >
> > > -----Original Message-----
> > > From: Wood Scott-B07421
> > > Sent: Friday, September 20, 2013 2:38 AM
> > > To: Bhushan Bharat-R65777
> > > Cc: benh@kernel.crashing.org; agraf@suse.de; paulus@samba.org;
> > > kvm@vger.kernel.org; kvm-ppc@vger.kernel.org;
> > > linuxppc-dev@lists.ozlabs.org; Bhushan Bharat-R65777
> > > Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference
> > > flag on guest tlb invalidation
> > >
> > > This breaks when you have both E500_TLB_BITMAP and E500_TLB_TLB0 set.
> >
> > I do not see any case where we set both E500_TLB_BITMAP and
> > E500_TLB_TLB0.
> 
> This would happen if you have a guest TLB1 entry that is backed by some 4K pages
> and some larger pages (e.g. if the guest maps CCSR with one big
> TLB1 and there are varying I/O passthrough regions mapped).  It's not common,
> but it's possible.

Agree

> 
> >  Also we have not optimized that yet (keeping track of multiple shadow
> > TLB0 entries for one guest TLB1 entry)
> 
> This is about correctness, not optimization.
> 
> > We uses these bit flags only for TLB1 and if size of stlbe is 4K then
> > we set E500_TLB_TLB0  otherwise we set E500_TLB_BITMAP. Although I
> > think that E500_TLB_BITMAP should be set only if stlbe size is less
> > than gtlbe size.
> 
> Why?  Even if there's only one bit set in the map, we need it to keep track of
> which entry was used.

If there is one entry then will not this be simple/faster to not lookup bitmap and guest->host array?
A flag indicate it is 1:1 map and this is physical address.

-Bharat

> 
> -Scott
>
Scott Wood Sept. 20, 2013, 6:08 p.m. UTC | #5
On Fri, 2013-09-20 at 13:04 -0500, Bhushan Bharat-R65777 wrote:
> 
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Friday, September 20, 2013 9:48 PM
> > To: Bhushan Bharat-R65777
> > Cc: Wood Scott-B07421; benh@kernel.crashing.org; agraf@suse.de;
> > paulus@samba.org; kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-
> > dev@lists.ozlabs.org
> > Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest
> > tlb invalidation
> > 
> > On Thu, 2013-09-19 at 23:19 -0500, Bhushan Bharat-R65777 wrote:
> > > We uses these bit flags only for TLB1 and if size of stlbe is 4K then
> > > we set E500_TLB_TLB0  otherwise we set E500_TLB_BITMAP. Although I
> > > think that E500_TLB_BITMAP should be set only if stlbe size is less
> > > than gtlbe size.
> > 
> > Why?  Even if there's only one bit set in the map, we need it to keep track of
> > which entry was used.
> 
> If there is one entry then will not this be simple/faster to not lookup bitmap and guest->host array?
> A flag indicate it is 1:1 map and this is physical address.

The difference would be negligible, and you'd have added overhead (both
runtime and complexity) of making this a special case.

-Scott
Bharat Bhushan Sept. 20, 2013, 6:11 p.m. UTC | #6
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Friday, September 20, 2013 11:38 PM
> To: Bhushan Bharat-R65777
> Cc: Wood Scott-B07421; benh@kernel.crashing.org; agraf@suse.de;
> paulus@samba.org; kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-
> dev@lists.ozlabs.org
> Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest
> tlb invalidation
> 
> On Fri, 2013-09-20 at 13:04 -0500, Bhushan Bharat-R65777 wrote:
> >
> > > -----Original Message-----
> > > From: Wood Scott-B07421
> > > Sent: Friday, September 20, 2013 9:48 PM
> > > To: Bhushan Bharat-R65777
> > > Cc: Wood Scott-B07421; benh@kernel.crashing.org; agraf@suse.de;
> > > paulus@samba.org; kvm@vger.kernel.org; kvm-ppc@vger.kernel.org;
> > > linuxppc- dev@lists.ozlabs.org
> > > Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference
> > > flag on guest tlb invalidation
> > >
> > > On Thu, 2013-09-19 at 23:19 -0500, Bhushan Bharat-R65777 wrote:
> > > > We uses these bit flags only for TLB1 and if size of stlbe is 4K
> > > > then we set E500_TLB_TLB0  otherwise we set E500_TLB_BITMAP.
> > > > Although I think that E500_TLB_BITMAP should be set only if stlbe
> > > > size is less than gtlbe size.
> > >
> > > Why?  Even if there's only one bit set in the map, we need it to
> > > keep track of which entry was used.
> >
> > If there is one entry then will not this be simple/faster to not lookup bitmap
> and guest->host array?
> > A flag indicate it is 1:1 map and this is physical address.
> 
> The difference would be negligible, and you'd have added overhead (both runtime
> and complexity) of making this a special case.

May be you are right , I will see if I can give a try :)
BTW I have already sent v6 of this patch.

-Bharat

> 
> -Scott
>
diff mbox

Patch

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..60f5a3c 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -217,7 +217,8 @@  void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel,
 		}
 		mb();
 		vcpu_e500->g2h_tlb1_map[esel] = 0;
-		ref->flags &= ~(E500_TLB_BITMAP | E500_TLB_VALID);
+		/* Clear flags as TLB is not backed by the host anymore */
+		ref->flags = 0;
 		local_irq_restore(flags);
 	}
 
@@ -227,7 +228,8 @@  void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel,
 		 * rarely and is not worth optimizing. Invalidate everything.
 		 */
 		kvmppc_e500_tlbil_all(vcpu_e500);
-		ref->flags &= ~(E500_TLB_TLB0 | E500_TLB_VALID);
+		/* Clear flags as TLB is not backed by the host anymore */
+		ref->flags = 0;
 	}
 
 	/* Already invalidated in between */
@@ -237,8 +239,8 @@  void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel,
 	/* Guest tlbe is backed by at most one host tlbe per shadow pid. */
 	kvmppc_e500_tlbil_one(vcpu_e500, gtlbe);
 
-	/* Mark the TLB as not backed by the host anymore */
-	ref->flags &= ~E500_TLB_VALID;
+	/* Clear flags as TLB is not backed by the host anymore */
+	ref->flags = 0;
 }
 
 static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe)
@@ -251,7 +253,7 @@  static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,
 					 pfn_t pfn)
 {
 	ref->pfn = pfn;
-	ref->flags |= E500_TLB_VALID;
+	ref->flags = E500_TLB_VALID;
 
 	if (tlbe_is_writable(gtlbe))
 		kvm_set_pfn_dirty(pfn);