Patchwork [2/2] kvm: ppc: booke: check range page invalidation progress on page setup

Submitter Paolo Bonzini
Date Oct. 7, 2013, 12:04 p.m.
Message ID <5252A35F.1000502@redhat.com>
Permalink /patch/281096/
State New
Headers show

Comments

Paolo Bonzini - Oct. 7, 2013, 12:04 p.m.
On 04/10/2013 15:38, Alexander Graf wrote:
> 
> On 07.08.2013, at 12:03, Bharat Bhushan wrote:
> 
>> When the MM code is invalidating a range of pages, it calls the KVM
>> kvm_mmu_notifier_invalidate_range_start() notifier function, which calls
>> kvm_unmap_hva_range(), which arranges to flush all the TLBs for guest pages.
>> However, the Linux PTEs for the range being flushed are still valid at
>> that point.  We are not supposed to establish any new references to pages
>> in the range until the ...range_end() notifier gets called.
>> The PPC-specific KVM code doesn't get any explicit notification of that;
>> instead, we are supposed to use mmu_notifier_retry() to test whether we
>> are or have been inside a range flush notifier pair while we have been
>> referencing a page.
>>
>> This patch calls mmu_notifier_retry() while mapping the guest
>> page to ensure we are not referencing a page while a range
>> invalidation is in progress.
>>
>> This call is made inside a region protected by kvm->mmu_lock, the
>> same lock that is taken by the KVM MMU notifier functions, thus
>> ensuring that no new notification can proceed while we are in the
>> locked region.
>>
>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> 
> Acked-by: Alexander Graf <agraf@suse.de>
> 
> Gleb, Paolo, please queue for 3.12 directly.

Here is the backport.  The second hunk has a nontrivial conflict, so
someone please give their {Tested,Reviewed,Compiled}-by.

Paolo



--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bharat Bhushan - Oct. 10, 2013, 8:32 a.m.
> Here is the backport.  The second hunk has a nontrivial conflict, so
> someone please give their {Tested,Reviewed,Compiled}-by.

{Compiled,Reviewed}-by: Bharat Bhushan <bharat.bhushan@freescale.com>

Thanks
-Bharat



Paolo Bonzini - Oct. 10, 2013, 9:01 a.m.
On 10/10/2013 10:32, Bhushan Bharat-R65777 wrote:
> {Compiled,Reviewed}-by: Bharat Bhushan <bharat.bhushan@freescale.com>

Thanks, patch on its way to Linus.

Paolo



Patch

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..c65593a 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -332,6 +332,13 @@  static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 	unsigned long hva;
 	int pfnmap = 0;
 	int tsize = BOOK3E_PAGESZ_4K;
+	int ret = 0;
+	unsigned long mmu_seq;
+	struct kvm *kvm = vcpu_e500->vcpu.kvm;
+
+	/* used to check for invalidations in progress */
+	mmu_seq = kvm->mmu_notifier_seq;
+	smp_rmb();
 
 	/*
 	 * Translate guest physical to true physical, acquiring
@@ -449,6 +456,12 @@  static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 		gvaddr &= ~((tsize_pages << PAGE_SHIFT) - 1);
 	}
 
+	spin_lock(&kvm->mmu_lock);
+	if (mmu_notifier_retry(kvm, mmu_seq)) {
+		ret = -EAGAIN;
+		goto out;
+	}
+
 	kvmppc_e500_ref_setup(ref, gtlbe, pfn);
 
 	kvmppc_e500_setup_stlbe(&vcpu_e500->vcpu, gtlbe, tsize,
@@ -457,10 +470,13 @@  static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 	/* Clear i-cache for new pages */
 	kvmppc_mmu_flush_icache(pfn);
 
+out:
+	spin_unlock(&kvm->mmu_lock);
+
 	/* Drop refcount on page, so that mmu notifiers can clear it */
 	kvm_release_pfn_clean(pfn);
 
-	return 0;
+	return ret;
 }
 
 /* XXX only map the one-one case, for now use TLB0 */