[v11,0/7] KVM: PPC: Driver to manage pages of secure guest

Message ID 20191125030631.7716-1-bharata@linux.ibm.com

Message

Bharata B Rao Nov. 25, 2019, 3:06 a.m. UTC
Hi,

This is the next version of the patchset that adds required support
in the KVM hypervisor to run secure guests on PEF-enabled POWER platforms.

This version includes the following changes:

- Ensure that malicious calls to the 4 hcalls (init_start, init_done,
  page_in and page_out) are handled safely by returning appropriate
  errors (Paul Mackerras); a condensed sketch of this kind of argument
  validation follows this list.
- Allow the init_start hcall only for radix guests.
- Fix the page-size-order argument in uv_page_inval (Ram Pai)
- Don't free up partition-scoped page tables in the HV when the guest
  becomes secure (Paul Mackerras)
- During guest reset, when we unpin VPA pages, make sure that no vcpu
  is running and fail the SVM_OFF ioctl if any are running (Paul Mackerras)
- Dropped the patch that implemented init_abort hcall as it still has
  unresolved questions.

Anshuman Khandual (1):
  KVM: PPC: Ultravisor: Add PPC_UV config option

Bharata B Rao (6):
  mm: ksm: Export ksm_madvise()
  KVM: PPC: Support for running secure guests
  KVM: PPC: Shared pages support for secure guests
  KVM: PPC: Radix changes for secure guest
  KVM: PPC: Handle memory plug/unplug to secure VM
  KVM: PPC: Support reset of secure guest

 Documentation/virt/kvm/api.txt              |  18 +
 arch/powerpc/Kconfig                        |  17 +
 arch/powerpc/include/asm/hvcall.h           |   9 +
 arch/powerpc/include/asm/kvm_book3s_uvmem.h |  74 ++
 arch/powerpc/include/asm/kvm_host.h         |   6 +
 arch/powerpc/include/asm/kvm_ppc.h          |   1 +
 arch/powerpc/include/asm/ultravisor-api.h   |   6 +
 arch/powerpc/include/asm/ultravisor.h       |  36 +
 arch/powerpc/kvm/Makefile                   |   3 +
 arch/powerpc/kvm/book3s_64_mmu_radix.c      |  25 +
 arch/powerpc/kvm/book3s_hv.c                | 143 ++++
 arch/powerpc/kvm/book3s_hv_uvmem.c          | 774 ++++++++++++++++++++
 arch/powerpc/kvm/powerpc.c                  |  12 +
 include/uapi/linux/kvm.h                    |   1 +
 mm/ksm.c                                    |   1 +
 15 files changed, 1126 insertions(+)
 create mode 100644 arch/powerpc/include/asm/kvm_book3s_uvmem.h
 create mode 100644 arch/powerpc/kvm/book3s_hv_uvmem.c

Comments

Bharata B Rao Nov. 28, 2019, 5:04 a.m. UTC | #1
On Mon, Nov 25, 2019 at 08:36:24AM +0530, Bharata B Rao wrote:
> Hi,
> 
> This is the next version of the patchset that adds required support
> in the KVM hypervisor to run secure guests on PEF-enabled POWER platforms.
> 

Here is a fix for the issue Hugh identified with the use of ksm_madvise()
in this patchset. It applies on top of the series.
----

From 8a4d769bf4c61f921c79ce68923be3c403bd5862 Mon Sep 17 00:00:00 2001
From: Bharata B Rao <bharata@linux.ibm.com>
Date: Thu, 28 Nov 2019 09:31:54 +0530
Subject: [PATCH 1/1] KVM: PPC: Book3S HV: Take write mmap_sem when calling
 ksm_madvise

To prevent the device private pages (which correspond to pages of the
secure guest) from participating in KSM merging, H_SVM_PAGE_IN calls
ksm_madvise(). It currently does so under the read version of mmap_sem,
but ksm_madvise() needs to be called with the write lock held. Fix this
by taking mmap_sem for write and downgrading it to read once
ksm_madvise() is done.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
 arch/powerpc/kvm/book3s_hv_uvmem.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index f24ac3cfb34c..2de264fc3156 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -46,11 +46,10 @@
  *
  * Locking order
  *
- * 1. srcu_read_lock(&kvm->srcu) - Protects KVM memslots
- * 2. down_read(&kvm->mm->mmap_sem) - find_vma, migrate_vma_pages and helpers
- * 3. mutex_lock(&kvm->arch.uvmem_lock) - protects read/writes to uvmem slots
- *					  thus acting as sync-points
- *					  for page-in/out
+ * 1. kvm->srcu - Protects KVM memslots
+ * 2. kvm->mm->mmap_sem - find_vma, migrate_vma_pages and helpers, ksm_madvise
+ * 3. kvm->arch.uvmem_lock - protects read/writes to uvmem slots thus acting
+ *			     as sync-points for page-in/out
  */
 
 /*
@@ -344,7 +343,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
 static int
 kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
 		   unsigned long end, unsigned long gpa, struct kvm *kvm,
-		   unsigned long page_shift)
+		   unsigned long page_shift, bool *downgrade)
 {
 	unsigned long src_pfn, dst_pfn = 0;
 	struct migrate_vma mig;
@@ -360,8 +359,15 @@ kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
 	mig.src = &src_pfn;
 	mig.dst = &dst_pfn;
 
+	/*
+	 * We come here with mmap_sem write lock held just for
+	 * ksm_madvise(), otherwise we only need read mmap_sem.
+	 * Hence downgrade to read lock once ksm_madvise() is done.
+	 */
 	ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
 			  MADV_UNMERGEABLE, &vma->vm_flags);
+	downgrade_write(&kvm->mm->mmap_sem);
+	*downgrade = true;
 	if (ret)
 		return ret;
 
@@ -456,6 +462,7 @@ unsigned long
 kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
 		     unsigned long flags, unsigned long page_shift)
 {
+	bool downgrade = false;
 	unsigned long start, end;
 	struct vm_area_struct *vma;
 	int srcu_idx;
@@ -476,7 +483,7 @@ kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
 
 	ret = H_PARAMETER;
 	srcu_idx = srcu_read_lock(&kvm->srcu);
-	down_read(&kvm->mm->mmap_sem);
+	down_write(&kvm->mm->mmap_sem);
 
 	start = gfn_to_hva(kvm, gfn);
 	if (kvm_is_error_hva(start))
@@ -492,12 +499,16 @@ kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
 	if (!vma || vma->vm_start > start || vma->vm_end < end)
 		goto out_unlock;
 
-	if (!kvmppc_svm_page_in(vma, start, end, gpa, kvm, page_shift))
+	if (!kvmppc_svm_page_in(vma, start, end, gpa, kvm, page_shift,
+				&downgrade))
 		ret = H_SUCCESS;
 out_unlock:
 	mutex_unlock(&kvm->arch.uvmem_lock);
 out:
-	up_read(&kvm->mm->mmap_sem);
+	if (downgrade)
+		up_read(&kvm->mm->mmap_sem);
+	else
+		up_write(&kvm->mm->mmap_sem);
 	srcu_read_unlock(&kvm->srcu, srcu_idx);
 	return ret;
 }
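
For readers skimming the diff, the resulting mmap_sem handling across
kvmppc_h_svm_page_in() and kvmppc_svm_page_in() condenses to roughly the
following. This is only a summary of the diff above, with error handling
and the migration details elided, not separate code:

	bool downgrade = false;
	int ret;

	down_write(&kvm->mm->mmap_sem);		/* write lock, needed only for ksm_madvise() */

	/* ... gfn_to_hva(), find_vma(), uvmem_lock, then in kvmppc_svm_page_in(): */
	ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
			  MADV_UNMERGEABLE, &vma->vm_flags);
	downgrade_write(&kvm->mm->mmap_sem);	/* rest of page-in needs only the read lock */
	downgrade = true;

	/* ... migrate_vma_setup()/migrate_vma_pages()/migrate_vma_finalize() ... */

	if (downgrade)
		up_read(&kvm->mm->mmap_sem);
	else
		up_write(&kvm->mm->mmap_sem);	/* only if we bailed out before the downgrade */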
Hugh Dickins Dec. 1, 2019, 8:24 p.m. UTC | #2
On Thu, 28 Nov 2019, Bharata B Rao wrote:
> On Mon, Nov 25, 2019 at 08:36:24AM +0530, Bharata B Rao wrote:
> > Hi,
> > 
> > This is the next version of the patchset that adds required support
> > in the KVM hypervisor to run secure guests on PEF-enabled POWER platforms.
> > 
> 
> Here is a fix for the issue Hugh identified with the usage of ksm_madvise()
> in this patchset. It applies on top of this patchset.

It looks correct to me, and I hope will not spoil your performance in any
way that matters.  But I have to say, the patch would be so much clearer,
if you just named your bool "downgraded" instead of "downgrade".

Hugh

Bharata B Rao Dec. 3, 2019, 9:44 a.m. UTC | #3
On Sun, Dec 01, 2019 at 12:24:50PM -0800, Hugh Dickins wrote:
> On Thu, 28 Nov 2019, Bharata B Rao wrote:
> > On Mon, Nov 25, 2019 at 08:36:24AM +0530, Bharata B Rao wrote:
> > > Hi,
> > > 
> > > This is the next version of the patchset that adds required support
> > > in the KVM hypervisor to run secure guests on PEF-enabled POWER platforms.
> > > 
> > 
> > Here is a fix for the issue Hugh identified with the usage of ksm_madvise()
> > in this patchset. It applies on top of this patchset.
> 
> It looks correct to me, and I hope will not spoil your performance in any
> way that matters.  But I have to say, the patch would be so much clearer,
> if you just named your bool "downgraded" instead of "downgrade".

Thanks for confirming. Yes, "downgraded" would have been more
appropriate; I will probably change it the next time we touch this part
of the code.

Regards,
Bharata.