
[2/4] KVM: PPC: Book3S HV: Map single pages when doing dirty page logging

Message ID 20181212041617.GC22265@blackberry
State Accepted
Series KVM: PPC: Book3S HV: Improve live migration of radix guests

Commit Message

Paul Mackerras Dec. 12, 2018, 4:16 a.m. UTC
For radix guests, this makes KVM map guest memory as individual pages
when dirty page logging is enabled for the memslot corresponding to the
guest real address.  Having a separate partition-scoped PTE for each
system page mapped to the guest means that we have a separate dirty
bit for each page, thus making the reported dirty bitmap more accurate.
Without this, if part of guest memory is backed by transparent huge
pages, the dirty status is reported at a 2MB granularity rather than
a 64kB (or 4kB) granularity for that part, causing userspace to have
to transmit more data when migrating the guest.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
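
To make the data-volume claim concrete, here is a small standalone C sketch (an illustration only, not part of the patch; it assumes the 64kB base page and 2MB huge page sizes named in the commit message) comparing what userspace must retransmit when a single base page is dirtied:

#include <stdio.h>

#define PAGE_SIZE   (64 * 1024UL)       /* assumed 64kB base page */
#define HPAGE_SIZE  (2 * 1024 * 1024UL) /* assumed 2MB transparent huge page */

int main(void)
{
	unsigned long dirtied  = PAGE_SIZE;   /* guest dirties one base page */
	unsigned long sent_2mb = HPAGE_SIZE;  /* one dirty bit per 2MB PTE */
	unsigned long sent_64k = PAGE_SIZE;   /* one dirty bit per base page */

	printf("actually dirtied:         %lu kB\n", dirtied / 1024);
	printf("sent at 2MB granularity:  %lu kB\n", sent_2mb / 1024);
	printf("sent at 64kB granularity: %lu kB\n", sent_64k / 1024);
	return 0;
}

With per-huge-page dirty bits, the migration stream carries 32 times more data for this access pattern; mapping single pages shrinks it to exactly what was dirtied.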

Comments

Suraj Jitindar Singh Dec. 12, 2018, 5:18 a.m. UTC | #1
On Wed, 2018-12-12 at 15:16 +1100, Paul Mackerras wrote:
> For radix guests, this makes KVM map guest memory as individual pages
> when dirty page logging is enabled for the memslot corresponding to the
> guest real address.  Having a separate partition-scoped PTE for each
> system page mapped to the guest means that we have a separate dirty
> bit for each page, thus making the reported dirty bitmap more accurate.
> Without this, if part of guest memory is backed by transparent huge
> pages, the dirty status is reported at a 2MB granularity rather than
> a 64kB (or 4kB) granularity for that part, causing userspace to have
> to transmit more data when migrating the guest.

Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>

David Gibson Dec. 12, 2018, 11:29 p.m. UTC | #2
On Wed, Dec 12, 2018 at 03:16:17PM +1100, Paul Mackerras wrote:
> For radix guests, this makes KVM map guest memory as individual pages
> when dirty page logging is enabled for the memslot corresponding to the
> guest real address.  Having a separate partition-scoped PTE for each
> system page mapped to the guest means that we have a separate dirty
> bit for each page, thus making the reported dirty bitmap more accurate.
> Without this, if part of guest memory is backed by transparent huge
> pages, the dirty status is reported at a 2MB granularity rather than
> a 64kB (or 4kB) granularity for that part, causing userspace to have
> to transmit more data when migrating the guest.
> 
> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

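For context on what flips this path, dirty page logging is enabled from userspace by setting KVM_MEM_LOG_DIRTY_PAGES on the memslot and harvested with KVM_GET_DIRTY_LOG. A minimal, hypothetical sketch of that side (error handling and VM setup elided; the helper names are made up for illustration):

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Re-register the memslot with logging enabled; this is what makes
 * the memslot->flags test in the patch take the single-page path. */
static int enable_dirty_logging(int vm_fd, __u32 slot, __u64 gpa,
				__u64 size, void *host_mem)
{
	struct kvm_userspace_memory_region region = {
		.slot = slot,
		.flags = KVM_MEM_LOG_DIRTY_PAGES,
		.guest_phys_addr = gpa,
		.memory_size = size,
		.userspace_addr = (__u64)(unsigned long)host_mem,
	};
	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}

/* Fetch the dirty bitmap: one bit per guest base page, which is why
 * per-page PTEs give an accurate map while 2MB PTEs over-report. */
static int get_dirty_log(int vm_fd, __u32 slot, void *bitmap)
{
	struct kvm_dirty_log log = {
		.slot = slot,
		.dirty_bitmap = bitmap,
	};
	return ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);
}
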

Patch

diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index d68162e..87ad35e 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -683,6 +683,7 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
 	pte_t pte, *ptep;
 	unsigned int shift, level;
 	int ret;
+	bool large_enable;
 
 	/* used to check for invalidations in progress */
 	mmu_seq = kvm->mmu_notifier_seq;
@@ -732,12 +733,15 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
 	pte = *ptep;
 	local_irq_enable();
 
+	/* If we're logging dirty pages, always map single pages */
+	large_enable = !(memslot->flags & KVM_MEM_LOG_DIRTY_PAGES);
+
 	/* Get pte level from shift/size */
-	if (shift == PUD_SHIFT &&
+	if (large_enable && shift == PUD_SHIFT &&
 	    (gpa & (PUD_SIZE - PAGE_SIZE)) ==
 	    (hva & (PUD_SIZE - PAGE_SIZE))) {
 		level = 2;
-	} else if (shift == PMD_SHIFT &&
+	} else if (large_enable && shift == PMD_SHIFT &&
 		   (gpa & (PMD_SIZE - PAGE_SIZE)) ==
 		   (hva & (PMD_SIZE - PAGE_SIZE))) {
 		level = 1;
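
A note on the unchanged alignment tests above: a large mapping is only usable when the guest physical address and the host virtual address have the same offset within the large page, so a single PMD- or PUD-level PTE can cover both. The sketch below (a standalone illustration with assumed 64kB/2MB sizes, not kernel code) restates that test together with the new dirty-logging gate the patch adds:

#include <stdbool.h>
#include <stdio.h>

#define PAGE_SIZE (64 * 1024UL)         /* assumed base page size */
#define PMD_SIZE  (2 * 1024 * 1024UL)   /* assumed large page size */

/* Hypothetical helper mirroring the patch's logic for one level. */
static bool can_map_large(unsigned long gpa, unsigned long hva,
			  unsigned long map_size, bool dirty_logging)
{
	/* The patch: dirty logging forces single-page mappings. */
	if (dirty_logging)
		return false;

	/* gpa and hva must be congruent modulo the large page size,
	 * i.e. share the same offset within a map_size region. */
	return (gpa & (map_size - PAGE_SIZE)) ==
	       (hva & (map_size - PAGE_SIZE));
}

int main(void)
{
	/* Both addresses sit 64kB into a 2MB-aligned region: allowed. */
	printf("%d\n", can_map_large(0x210000UL, 0x7f0000210000UL,
				     PMD_SIZE, false));        /* prints 1 */
	/* Same addresses with dirty logging enabled: refused. */
	printf("%d\n", can_map_large(0x210000UL, 0x7f0000210000UL,
				     PMD_SIZE, true));         /* prints 0 */
	return 0;
}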