[2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls

Message ID 20180405175631.31381-3-npiggin@gmail.com
State New
Headers show
Series
  • KVM powerpc tlbie scalability improvement
Related show

Commit Message

Nicholas Piggin April 5, 2018, 5:56 p.m.
tlbies to an LPAR do not have to be serialised since POWER4,
MMU_FTR_LOCKLESS_TLBIE can be used to avoid the spin lock in
do_tlbies.

Testing was done on a POWER9 system in HPT mode, with a -smp 32 guest
in HPT mode. 32 instances of the powerpc fork benchmark from selftests
were run with --fork, and the results measured.

Without this patch, total throughput was about 13.5K/sec, and this is
the top of the host profile:

   74.52%  [k] do_tlbies
    2.95%  [k] kvmppc_book3s_hv_page_fault
    1.80%  [k] calc_checksum
    1.80%  [k] kvmppc_vcpu_run_hv
    1.49%  [k] kvmppc_run_core

After this patch, throughput was about 51K/sec, with this profile:

   21.28%  [k] do_tlbies
    5.26%  [k] kvmppc_run_core
    4.88%  [k] kvmppc_book3s_hv_page_fault
    3.30%  [k] _raw_spin_lock_irqsave
    3.25%  [k] gup_pgd_range

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Comments

Nicholas Piggin April 6, 2018, 5:39 a.m. | #1
On Fri,  6 Apr 2018 03:56:31 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> tlbies to an LPAR do not have to be serialised since POWER4,
> MMU_FTR_LOCKLESS_TLBIE can be used to avoid the spin lock in
> do_tlbies.
> 
> Testing was done on a POWER9 system in HPT mode, with a -smp 32 guest
> in HPT mode. 32 instances of the powerpc fork benchmark from selftests
> were run with --fork, and the results measured.
> 
> Without this patch, total throughput was about 13.5K/sec, and this is
> the top of the host profile:
> 
>    74.52%  [k] do_tlbies
>     2.95%  [k] kvmppc_book3s_hv_page_fault
>     1.80%  [k] calc_checksum
>     1.80%  [k] kvmppc_vcpu_run_hv
>     1.49%  [k] kvmppc_run_core
> 
> After this patch, throughput was about 51K/sec, with this profile:
> 
>    21.28%  [k] do_tlbies
>     5.26%  [k] kvmppc_run_core
>     4.88%  [k] kvmppc_book3s_hv_page_fault
>     3.30%  [k] _raw_spin_lock_irqsave
>     3.25%  [k] gup_pgd_range
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/kvm/book3s_hv_rm_mmu.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> index 78e6a392330f..0221a0f74f07 100644
> --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> @@ -439,6 +439,9 @@ static inline int try_lock_tlbie(unsigned int *lock)
>  	unsigned int tmp, old;
>  	unsigned int token = LOCK_TOKEN;
>  
> +	if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE))
> +		return 1;
> +
>  	asm volatile("1:lwarx	%1,0,%2\n"
>  		     "	cmpwi	cr0,%1,0\n"
>  		     "	bne	2f\n"
> @@ -452,6 +455,12 @@ static inline int try_lock_tlbie(unsigned int *lock)
>  	return old == 0;
>  }
>  
> +static inline void unlock_tlbie_after_sync(unsigned int *lock)
> +{
> +	if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE))
> +		return;
> +}
> +
>  static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues,
>  		      long npages, int global, bool need_sync)
>  {
> @@ -483,7 +492,7 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues,
>  		}
>  
>  		asm volatile("eieio; tlbsync; ptesync" : : : "memory");
> -		kvm->arch.tlbie_lock = 0;
> +		unlock_tlbie_after_sync(&kvm->arch.tlbie_lock);

Well that's a silly bug in the !LOCKLESS path, that was supposed
to move to unlock, of course. Will fix it up after some time for
comments.

Thanks,
Nick
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Ellerman April 6, 2018, 6:12 a.m. | #2
Nicholas Piggin <npiggin@gmail.com> writes:
> diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> index 78e6a392330f..0221a0f74f07 100644
> --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> @@ -439,6 +439,9 @@ static inline int try_lock_tlbie(unsigned int *lock)
>  	unsigned int tmp, old;
>  	unsigned int token = LOCK_TOKEN;
>  
> +	if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE))
> +		return 1;
> +
>  	asm volatile("1:lwarx	%1,0,%2\n"
>  		     "	cmpwi	cr0,%1,0\n"
>  		     "	bne	2f\n"
> @@ -452,6 +455,12 @@ static inline int try_lock_tlbie(unsigned int *lock)
>  	return old == 0;
>  }
>  
> +static inline void unlock_tlbie_after_sync(unsigned int *lock)
> +{
> +	if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE))
> +		return;
> +}

So this is a bit hard to follow:

#define MMU_FTRS_DEFAULT_HPTE_ARCH_V2	\
	MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2
#define MMU_FTRS_POWER		MMU_FTRS_DEFAULT_HPTE_ARCH_V2
#define MMU_FTRS_PPC970		MMU_FTRS_POWER | MMU_FTR_TLBIE_CROP_VA			// does NOT
#define MMU_FTRS_POWER5		MMU_FTRS_POWER | MMU_FTR_LOCKLESS_TLBIE
#define MMU_FTRS_POWER6		MMU_FTRS_POWER5 | MMU_FTR_KERNEL_RO | MMU_FTR_68_BIT_VA	// includes lockless TLBIE
#define MMU_FTRS_POWER7		MMU_FTRS_POWER6						// includes lockless TLBIE
#define MMU_FTRS_POWER8		MMU_FTRS_POWER6						// includes lockless TLBIE
#define MMU_FTRS_POWER9		MMU_FTRS_POWER6						// includes lockless TLBIE
#define MMU_FTRS_CELL		MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | 			// does NOT
				MMU_FTR_CI_LARGE_PAGE
#define MMU_FTRS_PA6T		MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \			// does NOT
				MMU_FTR_CI_LARGE_PAGE | MMU_FTR_NO_SLBIE_B


So it's only 970, Cell and Pasemi that *don't* have lockless TLBIE.

And KVM HV doesn't doesn't run on any of those.

So we can just not check for the feature in the KVM HV code.

Am I right?

cheers
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 78e6a392330f..0221a0f74f07 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -439,6 +439,9 @@  static inline int try_lock_tlbie(unsigned int *lock)
 	unsigned int tmp, old;
 	unsigned int token = LOCK_TOKEN;
 
+	if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE))
+		return 1;
+
 	asm volatile("1:lwarx	%1,0,%2\n"
 		     "	cmpwi	cr0,%1,0\n"
 		     "	bne	2f\n"
@@ -452,6 +455,12 @@  static inline int try_lock_tlbie(unsigned int *lock)
 	return old == 0;
 }
 
+static inline void unlock_tlbie_after_sync(unsigned int *lock)
+{
+	if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE))
+		return;
+}
+
 static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues,
 		      long npages, int global, bool need_sync)
 {
@@ -483,7 +492,7 @@  static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues,
 		}
 
 		asm volatile("eieio; tlbsync; ptesync" : : : "memory");
-		kvm->arch.tlbie_lock = 0;
+		unlock_tlbie_after_sync(&kvm->arch.tlbie_lock);
 	} else {
 		if (need_sync)
 			asm volatile("ptesync" : : : "memory");