Patchwork [1/3] kvm/ppc/booke: Hold srcu lock when calling gfn functions

login
register
mail settings
Submitter Scott Wood
Date April 27, 2013, 12:53 a.m.
Message ID <1367024020-14204-1-git-send-email-scottwood@freescale.com>
Download mbox | patch
Permalink /patch/240065/
State New
Headers show

Comments

Scott Wood - April 27, 2013, 12:53 a.m.
KVM core expects arch code to acquire the srcu lock when calling
gfn_to_memslot and similar functions.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/kvm/44x_tlb.c  |    5 +++++
 arch/powerpc/kvm/booke.c    |   19 +++++++++++++++++++
 arch/powerpc/kvm/e500_mmu.c |    5 +++++
 3 files changed, 29 insertions(+)
Marcelo Tosatti - May 2, 2013, 12:15 a.m.
On Fri, Apr 26, 2013 at 07:53:38PM -0500, Scott Wood wrote:
> KVM core expects arch code to acquire the srcu lock when calling
> gfn_to_memslot and similar functions.
> 
> Signed-off-by: Scott Wood <scottwood@freescale.com>
> ---
>  arch/powerpc/kvm/44x_tlb.c  |    5 +++++
>  arch/powerpc/kvm/booke.c    |   19 +++++++++++++++++++
>  arch/powerpc/kvm/e500_mmu.c |    5 +++++
>  3 files changed, 29 insertions(+)
> 
> diff --git a/arch/powerpc/kvm/44x_tlb.c b/arch/powerpc/kvm/44x_tlb.c
> index 5dd3ab4..ed03854 100644
> --- a/arch/powerpc/kvm/44x_tlb.c
> +++ b/arch/powerpc/kvm/44x_tlb.c
> @@ -441,6 +441,7 @@ int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 rs, u8 ws)
>  	struct kvmppc_vcpu_44x *vcpu_44x = to_44x(vcpu);
>  	struct kvmppc_44x_tlbe *tlbe;
>  	unsigned int gtlb_index;
> +	int idx;
>  
>  	gtlb_index = kvmppc_get_gpr(vcpu, ra);
>  	if (gtlb_index >= KVM44x_GUEST_TLB_SIZE) {
> @@ -473,6 +474,8 @@ int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 rs, u8 ws)
>  		return EMULATE_FAIL;
>  	}
>  
> +	idx = srcu_read_lock(&vcpu->kvm->srcu);
> +
>  	if (tlbe_is_host_safe(vcpu, tlbe)) {
>  		gva_t eaddr;
>  		gpa_t gpaddr;
> @@ -489,6 +492,8 @@ int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 rs, u8 ws)
>  		kvmppc_mmu_map(vcpu, eaddr, gpaddr, gtlb_index);
>  	}
>  
> +	srcu_read_unlock(&vcpu->kvm->srcu, idx);
> +
>  	trace_kvm_gtlb_write(gtlb_index, tlbe->tid, tlbe->word0, tlbe->word1,
>  			     tlbe->word2);
>  
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 1020119..506c87d 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -832,6 +832,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>  {
>  	int r = RESUME_HOST;
>  	int s;
> +	int idx = 0; /* silence bogus uninitialized warning */
> +	bool need_srcu = false;
>  
>  	/* update before a new last_exit_type is rewritten */
>  	kvmppc_update_timing_stats(vcpu);
> @@ -847,6 +849,20 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>  	run->exit_reason = KVM_EXIT_UNKNOWN;
>  	run->ready_for_interrupt_injection = 1;
>  
> +	/*
> +	 * Don't get the srcu lock unconditionally, because kvm_ppc_pv()
> +	 * can call kvm_vcpu_block(), and kvm_ppc_pv() is shared with
> +	 * book3s, so dropping the srcu lock there would be awkward.
> +	 */
> +	switch (exit_nr) {
> +	case BOOKE_INTERRUPT_ITLB_MISS:
> +	case BOOKE_INTERRUPT_DTLB_MISS:
> +		need_srcu = true;
> +	}

This is not good practice (codepaths should either hold srcu or not hold
it, unconditionally).

Can you give more details of the issue? (not obvious)

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Scott Wood - May 2, 2013, 12:30 a.m.
On 05/01/2013 07:27:23 PM, Scott Wood wrote:
> On 05/01/2013 07:15:53 PM, Marcelo Tosatti wrote:
>> This is not good practice (codepaths should either hold srcu or not  
>> hold
>> it, unconditionally).
> 
> How is it different from moving the srcu lock into individual cases  
> of the switch?  I just did it this way to make it easier to add new  
> exception types if necessary (e.g. at the time I thought I'd end up  
> adding exceptions which lead to instruction emulation, but I ended up  
> acquiring the lock further down the path in that case).
> 
>> Can you give more details of the issue? (not obvious)
> 
> ITLB/DTLB miss call things like gfn_to_memslot() which need the lock  
> (but don't grab it themselves -- that seems like the real bad  
> practice here...).

Never mind on the parenthetical -- grabbing it themselves wouldn't work  
because they return RCU-protected data.

-Scott
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Marcelo Tosatti - May 2, 2013, 2:37 p.m.
On Wed, May 01, 2013 at 07:27:23PM -0500, Scott Wood wrote:
> On 05/01/2013 07:15:53 PM, Marcelo Tosatti wrote:
> >On Fri, Apr 26, 2013 at 07:53:38PM -0500, Scott Wood wrote:
> >> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> >> index 1020119..506c87d 100644
> >> --- a/arch/powerpc/kvm/booke.c
> >> +++ b/arch/powerpc/kvm/booke.c
> >> @@ -832,6 +832,8 @@ int kvmppc_handle_exit(struct kvm_run *run,
> >struct kvm_vcpu *vcpu,
> >>  {
> >>  	int r = RESUME_HOST;
> >>  	int s;
> >> +	int idx = 0; /* silence bogus uninitialized warning */
> >> +	bool need_srcu = false;
> >>
> >>  	/* update before a new last_exit_type is rewritten */
> >>  	kvmppc_update_timing_stats(vcpu);
> >> @@ -847,6 +849,20 @@ int kvmppc_handle_exit(struct kvm_run *run,
> >struct kvm_vcpu *vcpu,
> >>  	run->exit_reason = KVM_EXIT_UNKNOWN;
> >>  	run->ready_for_interrupt_injection = 1;
> >>
> >> +	/*
> >> +	 * Don't get the srcu lock unconditionally, because kvm_ppc_pv()
> >> +	 * can call kvm_vcpu_block(), and kvm_ppc_pv() is shared with
> >> +	 * book3s, so dropping the srcu lock there would be awkward.
> >> +	 */
> >> +	switch (exit_nr) {
> >> +	case BOOKE_INTERRUPT_ITLB_MISS:
> >> +	case BOOKE_INTERRUPT_DTLB_MISS:
> >> +		need_srcu = true;
> >> +	}
> >
> >This is not good practice (codepaths should either hold srcu or
> >not hold
> >it, unconditionally).
> 
> How is it different from moving the srcu lock into individual cases
> of the switch?  I just did it this way to make it easier to add new
> exception types if necessary (e.g. at the time I thought I'd end up
> adding exceptions which lead to instruction emulation, but I ended
> up acquiring the lock further down the path in that case).

Question: is this piece of code accessing this data structure?
Answer: it depends on a given runtime configuration.

Its confusing.

> >Can you give more details of the issue? (not obvious)
> 
> ITLB/DTLB miss call things like gfn_to_memslot() which need the lock
> (but don't grab it themselves -- that seems like the real bad
> practice here...).  The syscall exceptions can't have the SRCU lock
> held, because they call kvmppc_kvm_pv which can call
> kvm_vcpu_block() (yes, you can sleep with SRCU, but not
> indefinitely...).  kvmppc_kvm_pv is shared with book3s code, so
> adding code to drop the srcu lock there would be a problem since
> book3s doesn't hold the SRCU lock then...
> 
> -Scott

Its OK to nest srcu calls as long as there are properly ordered releases:

idx1 = srcu_read_lock()
idx2 = srcu_read_lock()

srcu_read_unlock(idx2)
srcu_read_unlock(idx1)

Is that helpful?

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/arch/powerpc/kvm/44x_tlb.c b/arch/powerpc/kvm/44x_tlb.c
index 5dd3ab4..ed03854 100644
--- a/arch/powerpc/kvm/44x_tlb.c
+++ b/arch/powerpc/kvm/44x_tlb.c
@@ -441,6 +441,7 @@  int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 rs, u8 ws)
 	struct kvmppc_vcpu_44x *vcpu_44x = to_44x(vcpu);
 	struct kvmppc_44x_tlbe *tlbe;
 	unsigned int gtlb_index;
+	int idx;
 
 	gtlb_index = kvmppc_get_gpr(vcpu, ra);
 	if (gtlb_index >= KVM44x_GUEST_TLB_SIZE) {
@@ -473,6 +474,8 @@  int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 rs, u8 ws)
 		return EMULATE_FAIL;
 	}
 
+	idx = srcu_read_lock(&vcpu->kvm->srcu);
+
 	if (tlbe_is_host_safe(vcpu, tlbe)) {
 		gva_t eaddr;
 		gpa_t gpaddr;
@@ -489,6 +492,8 @@  int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 rs, u8 ws)
 		kvmppc_mmu_map(vcpu, eaddr, gpaddr, gtlb_index);
 	}
 
+	srcu_read_unlock(&vcpu->kvm->srcu, idx);
+
 	trace_kvm_gtlb_write(gtlb_index, tlbe->tid, tlbe->word0, tlbe->word1,
 			     tlbe->word2);
 
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1020119..506c87d 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -832,6 +832,8 @@  int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 {
 	int r = RESUME_HOST;
 	int s;
+	int idx = 0; /* silence bogus uninitialized warning */
+	bool need_srcu = false;
 
 	/* update before a new last_exit_type is rewritten */
 	kvmppc_update_timing_stats(vcpu);
@@ -847,6 +849,20 @@  int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	run->exit_reason = KVM_EXIT_UNKNOWN;
 	run->ready_for_interrupt_injection = 1;
 
+	/*
+	 * Don't get the srcu lock unconditionally, because kvm_ppc_pv()
+	 * can call kvm_vcpu_block(), and kvm_ppc_pv() is shared with
+	 * book3s, so dropping the srcu lock there would be awkward.
+	 */
+	switch (exit_nr) {
+	case BOOKE_INTERRUPT_ITLB_MISS:
+	case BOOKE_INTERRUPT_DTLB_MISS:
+		need_srcu = true;
+	}
+
+	if (need_srcu)
+		idx = srcu_read_lock(&vcpu->kvm->srcu);
+
 	switch (exit_nr) {
 	case BOOKE_INTERRUPT_MACHINE_CHECK:
 		printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR));
@@ -1138,6 +1154,9 @@  int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		BUG();
 	}
 
+	if (need_srcu)
+		srcu_read_unlock(&vcpu->kvm->srcu, idx);
+
 	/*
 	 * To avoid clobbering exit_reason, only check for signals if we
 	 * aren't already exiting to userspace for some other reason.
diff --git a/arch/powerpc/kvm/e500_mmu.c b/arch/powerpc/kvm/e500_mmu.c
index c41a5a9..6d6f153 100644
--- a/arch/powerpc/kvm/e500_mmu.c
+++ b/arch/powerpc/kvm/e500_mmu.c
@@ -396,6 +396,7 @@  int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
 	struct kvm_book3e_206_tlb_entry *gtlbe;
 	int tlbsel, esel;
 	int recal = 0;
+	int idx;
 
 	tlbsel = get_tlb_tlbsel(vcpu);
 	esel = get_tlb_esel(vcpu, tlbsel);
@@ -430,6 +431,8 @@  int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
 			kvmppc_set_tlb1map_range(vcpu, gtlbe);
 	}
 
+	idx = srcu_read_lock(&vcpu->kvm->srcu);
+
 	/* Invalidate shadow mappings for the about-to-be-clobbered TLBE. */
 	if (tlbe_is_host_safe(vcpu, gtlbe)) {
 		u64 eaddr = get_tlb_eaddr(gtlbe);
@@ -444,6 +447,8 @@  int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
 		kvmppc_mmu_map(vcpu, eaddr, raddr, index_of(tlbsel, esel));
 	}
 
+	srcu_read_unlock(&vcpu->kvm->srcu, idx);
+
 	kvmppc_set_exit_type(vcpu, EMULATED_TLBWE_EXITS);
 	return EMULATE_DONE;
 }