KVM: PPC: Book3S HV: Avoid crash from THP collapse during radix page fault

Message ID 20181004045351.GC16300@fergus
State Accepted

Commit Message

Paul Mackerras Oct. 4, 2018, 4:53 a.m. UTC
Commit 71d29f43b633 ("KVM: PPC: Book3S HV: Don't use compound_order to
determine host mapping size", 2018-09-11) added a call to
__find_linux_pte() and a dereference of the returned PTE pointer to the
radix page fault path in the common case where the page is normal
system memory.  Previously, __find_linux_pte() was only called for
mappings to physical addresses which don't have a page struct (e.g.
memory-mapped I/O) or where the page struct is marked as reserved
memory.

This exposes us to the possibility that the returned PTE pointer
could be NULL, for example in the case of a concurrent THP collapse
operation.  Dereferencing the returned NULL pointer causes a host
crash.

To fix this, we check for NULL, and if it is NULL, we retry the
operation by returning to the guest, with the expectation that it
will generate the same page fault again (unless of course it has
been fixed up by another CPU in the meantime).

Fixes: 71d29f43b633 ("KVM: PPC: Book3S HV: Don't use compound_order to determine host mapping size")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Nicholas Piggin Oct. 4, 2018, 1:14 p.m. UTC | #1
On Thu, 4 Oct 2018 14:53:51 +1000
Paul Mackerras <paulus@ozlabs.org> wrote:

> Commit 71d29f43b633 ("KVM: PPC: Book3S HV: Don't use compound_order to
> determine host mapping size", 2018-09-11) added a call to
> __find_linux_pte() and a dereference of the returned PTE pointer to the
> radix page fault path in the common case where the page is normal
> system memory.  Previously, __find_linux_pte() was only called for
> mappings to physical addresses which don't have a page struct (e.g.
> memory-mapped I/O) or where the page struct is marked as reserved
> memory.
> 
> This exposes us to the possibility that the returned PTE pointer
> could be NULL, for example in the case of a concurrent THP collapse
> operation.  Dereferencing the returned NULL pointer causes a host
> crash.
> 
> To fix this, we check for NULL, and if it is NULL, we retry the
> operation by returning to the guest, with the expectation that it
> will generate the same page fault again (unless of course it has
> been fixed up by another CPU in the meantime).
> 
> Fixes: 71d29f43b633 ("KVM: PPC: Book3S HV: Don't use compound_order to determine host mapping size")
> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

This seems like a reasonable fix.

Thanks,
Nick

> ---
>  arch/powerpc/kvm/book3s_64_mmu_radix.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> index 933c574e1cf7..998f8d089ac7 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> @@ -646,6 +646,16 @@ int kvmppc_book3s_radix_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
>  	 */
>  	local_irq_disable();
>  	ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
> +	/*
> +	 * If the PTE disappeared temporarily due to a THP
> +	 * collapse, just return and let the guest try again.
> +	 */
> +	if (!ptep) {
> +		local_irq_enable();
> +		if (page)
> +			put_page(page);
> +		return RESUME_GUEST;
> +	}
>  	pte = *ptep;
>  	local_irq_enable();
>  
> -- 
> 2.11.0

Patch

diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 933c574e1cf7..998f8d089ac7 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -646,6 +646,16 @@  int kvmppc_book3s_radix_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	 */
 	local_irq_disable();
 	ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
+	/*
+	 * If the PTE disappeared temporarily due to a THP
+	 * collapse, just return and let the guest try again.
+	 */
+	if (!ptep) {
+		local_irq_enable();
+		if (page)
+			put_page(page);
+		return RESUME_GUEST;
+	}
 	pte = *ptep;
 	local_irq_enable();
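
The control flow of the check added above can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the kernel code: struct fake_mm, lookup_pte(), handle_fault(), run_until_handled(), the page_refs counter and the FAULT_HANDLED value are all hypothetical names made up for illustration. lookup_pte() stands in for __find_linux_pte() and fails a configurable number of times to mimic a PTE that is temporarily absent, for example during a THP collapse. The sketch assumes a page reference is always held on entry, whereas the real code only drops one if "page" is non-NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative return codes. The values and FAULT_HANDLED are assumptions
 * for this sketch and are not taken from the kernel. */
enum { RESUME_GUEST = 0, FAULT_HANDLED = 1 };

typedef unsigned long pte_t;

/* Hypothetical model of an address space holding a single PTE. */
struct fake_mm {
	pte_t pte;
	int transient_misses;	/* number of lookups that will return NULL */
};

/* Models the page reference that put_page() would drop. */
static int page_refs;

/* Stand-in for __find_linux_pte(): may transiently return NULL. */
static pte_t *lookup_pte(struct fake_mm *mm)
{
	if (mm->transient_misses > 0) {
		mm->transient_misses--;
		return NULL;
	}
	return &mm->pte;
}

/* One fault-handling attempt: never dereference a NULL PTE pointer, and
 * release the page reference on every exit path. */
static int handle_fault(struct fake_mm *mm, pte_t *out)
{
	pte_t *ptep;

	assert(out != NULL);
	page_refs++;			/* reference taken earlier in the path */

	ptep = lookup_pte(mm);
	if (!ptep) {
		page_refs--;		/* drop the reference before bailing out */
		return RESUME_GUEST;	/* the guest will fault again */
	}

	*out = *ptep;			/* safe: ptep is known to be non-NULL */
	page_refs--;
	return FAULT_HANDLED;
}

/* Models the guest re-taking the same fault until it is resolved.
 * Returns the number of attempts that were needed. */
static int run_until_handled(struct fake_mm *mm, pte_t *out)
{
	int attempts = 0;

	do {
		attempts++;
	} while (handle_fault(mm, out) == RESUME_GUEST);

	return attempts;
}
```

Calling run_until_handled() on a fake_mm whose transient_misses is 2 takes three attempts, and page_refs is back to zero afterwards. This mirrors the intent of the patch: a failed lookup costs one extra trip through the guest rather than a host crash, and no page reference is leaked on the early-return path.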