Patchwork [RFC,5/5] KVM: PPC: Take the SRCU lock around memslot use

login
register
mail settings
Submitter Paul Mackerras
Date Aug. 6, 2012, 10:08 a.m.
Message ID <20120806100816.GF8980@bloggs.ozlabs.ibm.com>
Download mbox | patch
Permalink /patch/175310/
State New
Headers show

Comments

Paul Mackerras - Aug. 6, 2012, 10:08 a.m.
The generic KVM code uses SRCU (sleeping RCU) to protect accesses
to the memslots data structures against updates due to userspace
adding, modifying or removing memory slots.  We need to do that too,
both to avoid accessing stale copies of the memslots and to avoid
lockdep warnings.  This therefore adds srcu_read_lock/unlock pairs
around code that accesses and uses memslots in the Book 3S PR code
and the Book E (44x and e500) code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
Compile-tested only.

 arch/powerpc/kvm/44x_tlb.c   |    6 ++++++
 arch/powerpc/kvm/book3s_pr.c |    6 ++++++
 arch/powerpc/kvm/e500_tlb.c  |    6 ++++++
 3 files changed, 18 insertions(+)
Marcelo Tosatti - Aug. 9, 2012, 6:27 p.m.
On Mon, Aug 06, 2012 at 08:08:16PM +1000, Paul Mackerras wrote:
> The generic KVM code uses SRCU (sleeping RCU) to protect accesses
> to the memslots data structures against updates due to userspace
> adding, modifying or removing memory slots.  We need to do that too,
> both to avoid accessing stale copies of the memslots and to avoid
> lockdep warnings.  This therefore adds srcu_read_lock/unlock pairs
> around code that accesses and uses memslots in the Book 3S PR code
> and the Book E (44x and e500) code.
> 
> Signed-off-by: Paul Mackerras <paulus@samba.org>
> ---
> Compile-tested only.
> 
>  arch/powerpc/kvm/44x_tlb.c   |    6 ++++++
>  arch/powerpc/kvm/book3s_pr.c |    6 ++++++
>  arch/powerpc/kvm/e500_tlb.c  |    6 ++++++
>  3 files changed, 18 insertions(+)

On top of the previous comment:

x86 calls srcu_read_lock at the beginning of the KVM_RUN ioctl handler
(__vcpu_run in arch/x86/kvm/x86.c), unlocks srcu on guest entry, locks
on guest exit before any potential use of memslots, and unlocks on
exit to userspace.

This has the advantage of not sprinkling srcu lock/unlock calls all over
(except from other ioctls, of course). Its low maintenance.

Perhaps doing the same on PPC is not a bad idea.

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Mackerras - Aug. 10, 2012, 12:37 a.m.
On Thu, Aug 09, 2012 at 03:27:17PM -0300, Marcelo Tosatti wrote:
> On Mon, Aug 06, 2012 at 08:08:16PM +1000, Paul Mackerras wrote:
> > The generic KVM code uses SRCU (sleeping RCU) to protect accesses
> > to the memslots data structures against updates due to userspace
> > adding, modifying or removing memory slots.  We need to do that too,
> > both to avoid accessing stale copies of the memslots and to avoid
> > lockdep warnings.  This therefore adds srcu_read_lock/unlock pairs
> > around code that accesses and uses memslots in the Book 3S PR code
> > and the Book E (44x and e500) code.
> > 
> > Signed-off-by: Paul Mackerras <paulus@samba.org>
> > ---
> > Compile-tested only.
> > 
> >  arch/powerpc/kvm/44x_tlb.c   |    6 ++++++
> >  arch/powerpc/kvm/book3s_pr.c |    6 ++++++
> >  arch/powerpc/kvm/e500_tlb.c  |    6 ++++++
> >  3 files changed, 18 insertions(+)
> 
> On top of the previous comment:
> 
> x86 calls srcu_read_lock at the beginning of the KVM_RUN ioctl handler
> (__vcpu_run in arch/x86/kvm/x86.c), unlocks srcu on guest entry, locks
> on guest exit before any potential use of memslots, and unlocks on
> exit to userspace.
> 
> This has the advantage of not sprinkling srcu lock/unlock calls all over
> (except from other ioctls, of course). Its low maintenance.
> 
> Perhaps doing the same on PPC is not a bad idea.

Perhaps... these changes are to areas of the PPC KVM code that I don't
use or maintain, so they're really more a suggestion of one way to fix
the problem than anything else.  That's why I put RFC in the subject
line.  It would be up to Alex whether he wants to fix it like this or
by taking the SRCU lock at a higher level.

Paul.
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Graf - Aug. 10, 2012, 9:27 a.m.
On 10.08.2012, at 02:37, Paul Mackerras wrote:

> On Thu, Aug 09, 2012 at 03:27:17PM -0300, Marcelo Tosatti wrote:
>> On Mon, Aug 06, 2012 at 08:08:16PM +1000, Paul Mackerras wrote:
>>> The generic KVM code uses SRCU (sleeping RCU) to protect accesses
>>> to the memslots data structures against updates due to userspace
>>> adding, modifying or removing memory slots.  We need to do that too,
>>> both to avoid accessing stale copies of the memslots and to avoid
>>> lockdep warnings.  This therefore adds srcu_read_lock/unlock pairs
>>> around code that accesses and uses memslots in the Book 3S PR code
>>> and the Book E (44x and e500) code.
>>> 
>>> Signed-off-by: Paul Mackerras <paulus@samba.org>
>>> ---
>>> Compile-tested only.
>>> 
>>> arch/powerpc/kvm/44x_tlb.c   |    6 ++++++
>>> arch/powerpc/kvm/book3s_pr.c |    6 ++++++
>>> arch/powerpc/kvm/e500_tlb.c  |    6 ++++++
>>> 3 files changed, 18 insertions(+)
>> 
>> On top of the previous comment:
>> 
>> x86 calls srcu_read_lock at the beginning of the KVM_RUN ioctl handler
>> (__vcpu_run in arch/x86/kvm/x86.c), unlocks srcu on guest entry, locks
>> on guest exit before any potential use of memslots, and unlocks on
>> exit to userspace.
>> 
>> This has the advantage of not sprinkling srcu lock/unlock calls all over
>> (except from other ioctls, of course). Its low maintenance.
>> 
>> Perhaps doing the same on PPC is not a bad idea.
> 
> Perhaps... these changes are to areas of the PPC KVM code that I don't
> use or maintain, so they're really more a suggestion of one way to fix
> the problem than anything else.  That's why I put RFC in the subject
> line.  It would be up to Alex whether he wants to fix it like this or
> by taking the SRCU lock at a higher level.

My general approach to these things is "keep it as close to x86 as we can", because that'll make it easier for generic code to work. I don't know if the above scheme would work for you though, since you probably want the lock held in real mode, right?


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Benjamin Herrenschmidt - Aug. 15, 2012, 8:16 a.m.
On Fri, 2012-08-10 at 11:27 +0200, Alexander Graf wrote:
> > Perhaps... these changes are to areas of the PPC KVM code that I
> don't
> > use or maintain, so they're really more a suggestion of one way to
> fix
> > the problem than anything else.  That's why I put RFC in the subject
> > line.  It would be up to Alex whether he wants to fix it like this
> or
> > by taking the SRCU lock at a higher level.
> 
> My general approach to these things is "keep it as close to x86 as we
> can", because that'll make it easier for generic code to work. I don't
> know if the above scheme would work for you though, since you probably
> want the lock held in real mode, right?

Well it wouldn't in the sense that we would still have to keep it held
while running the guest on "HV" KVM.

I've had a look at whether it would be feasible to hack a variant of the
srcu_read_lock that's usable in real mode but it looks like it uses
dynamically allocated per-cpu vars and these are in some funky vmalloc
space....

We could -still- try to get to them by doing some funky page table
walking but it becomes fairly hairy. And that code would be fragile and
prone to breakage if the SRCU code changes.

So it is an option but not one I'm happy to contemplate at this point.

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/arch/powerpc/kvm/44x_tlb.c b/arch/powerpc/kvm/44x_tlb.c
index 33aa715..7dcce4e 100644
--- a/arch/powerpc/kvm/44x_tlb.c
+++ b/arch/powerpc/kvm/44x_tlb.c
@@ -22,6 +22,7 @@ 
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
 #include <linux/highmem.h>
+#include <linux/srcu.h>
 
 #include <asm/tlbflush.h>
 #include <asm/mmu-44x.h>
@@ -442,6 +443,7 @@  int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 rs, u8 ws)
 	struct kvmppc_vcpu_44x *vcpu_44x = to_44x(vcpu);
 	struct kvmppc_44x_tlbe *tlbe;
 	unsigned int gtlb_index;
+	int srcu_idx;
 
 	gtlb_index = kvmppc_get_gpr(vcpu, ra);
 	if (gtlb_index >= KVM44x_GUEST_TLB_SIZE) {
@@ -474,6 +476,8 @@  int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 rs, u8 ws)
 		return EMULATE_FAIL;
 	}
 
+	srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
+
 	if (tlbe_is_host_safe(vcpu, tlbe)) {
 		gva_t eaddr;
 		gpa_t gpaddr;
@@ -490,6 +494,8 @@  int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 rs, u8 ws)
 		kvmppc_mmu_map(vcpu, eaddr, gpaddr, gtlb_index);
 	}
 
+	srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx);
+
 	trace_kvm_gtlb_write(gtlb_index, tlbe->tid, tlbe->word0, tlbe->word1,
 			     tlbe->word2);
 
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index c5e0062..a786730 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -23,6 +23,7 @@ 
 #include <linux/export.h>
 #include <linux/err.h>
 #include <linux/slab.h>
+#include <linux/srcu.h>
 
 #include <asm/reg.h>
 #include <asm/cputable.h>
@@ -290,6 +291,7 @@  int kvmppc_handle_pagefault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	bool dr = (vcpu->arch.shared->msr & MSR_DR) ? true : false;
 	bool ir = (vcpu->arch.shared->msr & MSR_IR) ? true : false;
 	u64 vsid;
+	int srcu_idx;
 
 	relocated = data ? dr : ir;
 
@@ -334,6 +336,8 @@  int kvmppc_handle_pagefault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		pte.may_execute = !data;
 	}
 
+	srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
+
 	if (page_found == -ENOENT) {
 		/* Page not found in guest PTE entries */
 		struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu);
@@ -376,6 +380,8 @@  int kvmppc_handle_pagefault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 			r = RESUME_HOST;
 	}
 
+	srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx);
+
 	return r;
 }
 
diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index c8f6c58..895dc31 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -28,6 +28,7 @@ 
 #include <linux/rwsem.h>
 #include <linux/vmalloc.h>
 #include <linux/hugetlb.h>
+#include <linux/srcu.h>
 #include <asm/kvm_ppc.h>
 
 #include "e500.h"
@@ -858,6 +859,7 @@  int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
 	struct kvm_book3e_206_tlb_entry *gtlbe, stlbe;
 	int tlbsel, esel, stlbsel, sesel;
 	int recal = 0;
+	int srcu_idx;
 
 	tlbsel = get_tlb_tlbsel(vcpu);
 	esel = get_tlb_esel(vcpu, tlbsel);
@@ -890,6 +892,8 @@  int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
 			kvmppc_set_tlb1map_range(vcpu, gtlbe);
 	}
 
+	srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
+
 	/* Invalidate shadow mappings for the about-to-be-clobbered TLBE. */
 	if (tlbe_is_host_safe(vcpu, gtlbe)) {
 		u64 eaddr;
@@ -928,6 +932,8 @@  int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
 		write_stlbe(vcpu_e500, gtlbe, &stlbe, stlbsel, sesel);
 	}
 
+	srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx);
+
 	kvmppc_set_exit_type(vcpu, EMULATED_TLBWE_EXITS);
 	return EMULATE_DONE;
 }