Patchwork [v3,12/14] KVM: Add barriers to allow mmu_notifier_retry to be used locklessly

Submitter Paul Mackerras
Date Dec. 12, 2011, 10:37 p.m.
Message ID <20111212223720.GM18868@bloggs.ozlabs.ibm.com>
Permalink /patch/130922/
State New

Comments

Paul Mackerras - Dec. 12, 2011, 10:37 p.m.
This adds an smp_wmb in kvm_mmu_notifier_invalidate_range_end() and an
smp_rmb in mmu_notifier_retry() so that mmu_notifier_retry() will give
the correct answer when called without kvm->mmu_lock being held.
PowerPC Book3S HV KVM wants to use a bitlock per guest page rather than
a single global spinlock in order to improve the scalability of updates
to the guest MMU hashed page table, and so needs this.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 include/linux/kvm_host.h |   14 +++++++++-----
 virt/kvm/kvm_main.c      |    6 +++---
 2 files changed, 12 insertions(+), 8 deletions(-)
Alexander Graf - Dec. 19, 2011, 5:18 p.m.
On 12.12.2011, at 23:37, Paul Mackerras wrote:

> This adds an smp_wmb in kvm_mmu_notifier_invalidate_range_end() and an
> smp_rmb in mmu_notifier_retry() so that mmu_notifier_retry() will give
> the correct answer when called without kvm->mmu_lock being held.
> PowerPC Book3S HV KVM wants to use a bitlock per guest page rather than
> a single global spinlock in order to improve the scalability of updates
> to the guest MMU hashed page table, and so needs this.
> 
> Signed-off-by: Paul Mackerras <paulus@samba.org>

Avi, mind to ack?


Alex

> ---
> include/linux/kvm_host.h |   14 +++++++++-----
> virt/kvm/kvm_main.c      |    6 +++---
> 2 files changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 8c5c303..ec79a45 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -700,12 +700,16 @@ static inline int mmu_notifier_retry(struct kvm_vcpu *vcpu, unsigned long mmu_se
> 	if (unlikely(vcpu->kvm->mmu_notifier_count))
> 		return 1;
> 	/*
> -	 * Both reads happen under the mmu_lock and both values are
> -	 * modified under mmu_lock, so there's no need of smb_rmb()
> -	 * here in between, otherwise mmu_notifier_count should be
> -	 * read before mmu_notifier_seq, see
> -	 * mmu_notifier_invalidate_range_end write side.
> +	 * Ensure the read of mmu_notifier_count happens before the read
> +	 * of mmu_notifier_seq.  This interacts with the smp_wmb() in
> +	 * mmu_notifier_invalidate_range_end to make sure that the caller
> +	 * either sees the old (non-zero) value of mmu_notifier_count or
> +	 * the new (incremented) value of mmu_notifier_seq.
> +	 * PowerPC Book3s HV KVM calls this under a per-page lock
> +	 * rather than under kvm->mmu_lock, for scalability, so
> +	 * can't rely on kvm->mmu_lock to keep things ordered.
> 	 */
> +	smp_rmb();
> 	if (vcpu->kvm->mmu_notifier_seq != mmu_seq)
> 		return 1;
> 	return 0;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index e289486..c144132 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -357,11 +357,11 @@ static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
> 	 * been freed.
> 	 */
> 	kvm->mmu_notifier_seq++;
> +	smp_wmb();
> 	/*
> 	 * The above sequence increase must be visible before the
> -	 * below count decrease but both values are read by the kvm
> -	 * page fault under mmu_lock spinlock so we don't need to add
> -	 * a smb_wmb() here in between the two.
> +	 * below count decrease, which is ensured by the smp_wmb above
> +	 * in conjunction with the smp_rmb in mmu_notifier_retry().
> 	 */
> 	kvm->mmu_notifier_count--;
> 	spin_unlock(&kvm->mmu_lock);
> -- 
> 1.7.7.3
> 

Avi Kivity - Dec. 19, 2011, 5:21 p.m.
On 12/19/2011 07:18 PM, Alexander Graf wrote:
> On 12.12.2011, at 23:37, Paul Mackerras wrote:
>
> > This adds an smp_wmb in kvm_mmu_notifier_invalidate_range_end() and an
> > smp_rmb in mmu_notifier_retry() so that mmu_notifier_retry() will give
> > the correct answer when called without kvm->mmu_lock being held.
> > PowerPC Book3S HV KVM wants to use a bitlock per guest page rather than
> > a single global spinlock in order to improve the scalability of updates
> > to the guest MMU hashed page table, and so needs this.
> > 
> > Signed-off-by: Paul Mackerras <paulus@samba.org>
>
> Avi, mind to ack?
>

Acked-by: Avi Kivity <avi@redhat.com>
