
[12/26] KVM: track whether guest state is encrypted

Message ID 20240322181116.1228416-13-pbonzini@redhat.com
State New
Series x86, kvm: common confidential computing subset

Commit Message

Paolo Bonzini March 22, 2024, 6:11 p.m. UTC
So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the
guest state is encrypted, in which case they do nothing.  With the new
API using VM types, instead, the ioctls will fail, which is a safer and
more robust approach.

The new API will be the only one available for SEV-SNP and TDX, but it
is also usable for SEV and SEV-ES.  In preparation for that, require
architecture-specific KVM code to communicate the point at which guest
state is protected (which must be after kvm_cpu_synchronize_post_init(),
though that might change in the future in order to support migration).
From that point, skip reading registers so that cpu->vcpu_dirty is
never true: if it ever becomes true, kvm_arch_put_registers() will
fail miserably.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/sysemu/kvm.h     |  2 ++
 include/sysemu/kvm_int.h |  1 +
 accel/kvm/kvm-all.c      | 14 ++++++++++++--
 target/i386/sev.c        |  1 +
 4 files changed, 16 insertions(+), 2 deletions(-)
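
The expected usage pattern for an architecture backend is sketched below.
This is a hypothetical illustration (the my_arch_finalize_launch() notifier
and my_arch_encrypt_initial_state() helper are made up); the real hook added
by this patch is the sev_launch_get_measure() hunk in target/i386/sev.c:

    /*
     * Hypothetical launch-finalization notifier for an architecture backend.
     * Once the initial guest image and vCPU state have been encrypted and
     * measured, tell the generic KVM code that guest registers can no longer
     * be read or written.
     */
    static void my_arch_finalize_launch(Notifier *notifier, void *unused)
    {
        int ret = my_arch_encrypt_initial_state();  /* assumed arch helper */
        if (ret) {
            exit(1);
        }
        /*
         * From here on kvm_cpu_synchronize_state() becomes a no-op and
         * cpu->vcpu_dirty must never become true again, otherwise
         * kvm_arch_put_registers() would fail.
         */
        kvm_mark_guest_state_protected();
    }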

Comments

Philippe Mathieu-Daudé March 25, 2024, 9:25 a.m. UTC | #1
On 22/3/24 19:11, Paolo Bonzini wrote:
> So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the
> guest state is encrypted, in which case they do nothing.  For the new
> API using VM types, instead, the ioctls will fail which is a safer and
> more robust approach.
> 
> The new API will be the only one available for SEV-SNP and TDX, but it
> is also usable for SEV and SEV-ES.  In preparation for that, require
> architecture-specific KVM code to communicate the point at which guest
> state is protected (which must be after kvm_cpu_synchronize_post_init(),
> though that might change in the future in order to support migration).
> From that point, skip reading registers so that cpu->vcpu_dirty is
> never true: if it ever becomes true, kvm_arch_put_registers() will
> fail miserably.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   include/sysemu/kvm.h     |  2 ++
>   include/sysemu/kvm_int.h |  1 +
>   accel/kvm/kvm-all.c      | 14 ++++++++++++--
>   target/i386/sev.c        |  1 +
>   4 files changed, 16 insertions(+), 2 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Xiaoyao Li March 26, 2024, 3:48 p.m. UTC | #2
On 3/23/2024 2:11 AM, Paolo Bonzini wrote:
> So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the
> guest state is encrypted, in which case they do nothing.  For the new
> API using VM types, instead, the ioctls will fail which is a safer and
> more robust approach.
> 
> The new API will be the only one available for SEV-SNP and TDX, but it
> is also usable for SEV and SEV-ES.  In preparation for that, require
> architecture-specific KVM code to communicate the point at which guest
> state is protected (which must be after kvm_cpu_synchronize_post_init(),
> though that might change in the future in order to support migration).
> From that point, skip reading registers so that cpu->vcpu_dirty is
> never true: if it ever becomes true, kvm_arch_put_registers() will
> fail miserably.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   include/sysemu/kvm.h     |  2 ++
>   include/sysemu/kvm_int.h |  1 +
>   accel/kvm/kvm-all.c      | 14 ++++++++++++--
>   target/i386/sev.c        |  1 +
>   4 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index fad9a7e8ff3..302e8f6f1e5 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -539,6 +539,8 @@ bool kvm_dirty_ring_enabled(void);
>   
>   uint32_t kvm_dirty_ring_size(void);
>   
> +void kvm_mark_guest_state_protected(void);
> +
>   /**
>    * kvm_hwpoisoned_mem - indicate if there is any hwpoisoned page
>    * reported for the VM.
> diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
> index 882e37e12c5..3496be7997a 100644
> --- a/include/sysemu/kvm_int.h
> +++ b/include/sysemu/kvm_int.h
> @@ -87,6 +87,7 @@ struct KVMState
>       bool kernel_irqchip_required;
>       OnOffAuto kernel_irqchip_split;
>       bool sync_mmu;
> +    bool guest_state_protected;
>       uint64_t manual_dirty_log_protect;
>       /* The man page (and posix) say ioctl numbers are signed int, but
>        * they're not.  Linux, glibc and *BSD all treat ioctl numbers as
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index a8cecd040eb..05fa3533c66 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -2698,7 +2698,7 @@ bool kvm_cpu_check_are_resettable(void)
>   
>   static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg)
>   {
> -    if (!cpu->vcpu_dirty) {
> +    if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) {
>           int ret = kvm_arch_get_registers(cpu);
>           if (ret) {
>               error_report("Failed to get registers: %s", strerror(-ret));
> @@ -2712,7 +2712,7 @@ static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg)
>   
>   void kvm_cpu_synchronize_state(CPUState *cpu)
>   {
> -    if (!cpu->vcpu_dirty) {
> +    if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) {
>           run_on_cpu(cpu, do_kvm_cpu_synchronize_state, RUN_ON_CPU_NULL);
>       }
>   }
> @@ -2747,6 +2747,11 @@ static void do_kvm_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg)
>   
>   void kvm_cpu_synchronize_post_init(CPUState *cpu)
>   {
> +    /*
> +     * This runs before the machine_init_done notifiers, and is the last
> +     * opportunity to synchronize the state of confidential guests.
> +     */
> +    assert(!kvm_state->guest_state_protected);

So, this requires a confidential guest to call
kvm_mark_guest_state_protected() in its machine_init_done notifier callback?

But for TDX, the guest state is protected from the very beginning, not at
some later point when the machine_init_done notifiers run.

>       run_on_cpu(cpu, do_kvm_cpu_synchronize_post_init, RUN_ON_CPU_NULL);
>   }
>   
> @@ -4094,3 +4099,8 @@ void query_stats_schemas_cb(StatsSchemaList **result, Error **errp)
>           query_stats_schema_vcpu(first_cpu, &stats_args);
>       }
>   }
> +
> +void kvm_mark_guest_state_protected(void)
> +{
> +    kvm_state->guest_state_protected = true;
> +}
> diff --git a/target/i386/sev.c b/target/i386/sev.c
> index b8f79d34d19..c49a8fd55eb 100644
> --- a/target/i386/sev.c
> +++ b/target/i386/sev.c
> @@ -755,6 +755,7 @@ sev_launch_get_measure(Notifier *notifier, void *unused)
>           if (ret) {
>               exit(1);
>           }
> +        kvm_mark_guest_state_protected();
>       }
>   
>       /* query the measurement blob length */
Paolo Bonzini March 27, 2024, 9:05 a.m. UTC | #3
On Tue, Mar 26, 2024 at 4:48 PM Xiaoyao Li <xiaoyao.li@intel.com> wrote:
> So, this requires confidential guests to call
> kvm_mark_guest_state_protected() in its machine_init_done notifier callback?
>
> But for TDX, the guest_state is protected at the beginning, not some
> time later when machine_init_done.

Good point, I will change this to an "if".

Paolo
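
Presumably the follow-up revision relaxes the post-init hook along these
lines, so that a guest whose state is protected from the very beginning
(the TDX case above) simply skips the final synchronization instead of
hitting the assertion. This is a sketch of the change described, not the
code actually posted:

    void kvm_cpu_synchronize_post_init(CPUState *cpu)
    {
        if (!kvm_state->guest_state_protected) {
            /*
             * This runs before the machine_init_done notifiers, and is the
             * last opportunity to synchronize the state of confidential
             * guests that become protected later (e.g. SEV-ES at launch
             * measurement time).
             */
            run_on_cpu(cpu, do_kvm_cpu_synchronize_post_init, RUN_ON_CPU_NULL);
        }
    }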

Patch

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index fad9a7e8ff3..302e8f6f1e5 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -539,6 +539,8 @@  bool kvm_dirty_ring_enabled(void);
 
 uint32_t kvm_dirty_ring_size(void);
 
+void kvm_mark_guest_state_protected(void);
+
 /**
  * kvm_hwpoisoned_mem - indicate if there is any hwpoisoned page
  * reported for the VM.
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index 882e37e12c5..3496be7997a 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -87,6 +87,7 @@  struct KVMState
     bool kernel_irqchip_required;
     OnOffAuto kernel_irqchip_split;
     bool sync_mmu;
+    bool guest_state_protected;
     uint64_t manual_dirty_log_protect;
     /* The man page (and posix) say ioctl numbers are signed int, but
      * they're not.  Linux, glibc and *BSD all treat ioctl numbers as
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index a8cecd040eb..05fa3533c66 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2698,7 +2698,7 @@  bool kvm_cpu_check_are_resettable(void)
 
 static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg)
 {
-    if (!cpu->vcpu_dirty) {
+    if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) {
         int ret = kvm_arch_get_registers(cpu);
         if (ret) {
             error_report("Failed to get registers: %s", strerror(-ret));
@@ -2712,7 +2712,7 @@  static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg)
 
 void kvm_cpu_synchronize_state(CPUState *cpu)
 {
-    if (!cpu->vcpu_dirty) {
+    if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) {
         run_on_cpu(cpu, do_kvm_cpu_synchronize_state, RUN_ON_CPU_NULL);
     }
 }
@@ -2747,6 +2747,11 @@  static void do_kvm_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg)
 
 void kvm_cpu_synchronize_post_init(CPUState *cpu)
 {
+    /*
+     * This runs before the machine_init_done notifiers, and is the last
+     * opportunity to synchronize the state of confidential guests.
+     */
+    assert(!kvm_state->guest_state_protected);
     run_on_cpu(cpu, do_kvm_cpu_synchronize_post_init, RUN_ON_CPU_NULL);
 }
 
@@ -4094,3 +4099,8 @@  void query_stats_schemas_cb(StatsSchemaList **result, Error **errp)
         query_stats_schema_vcpu(first_cpu, &stats_args);
     }
 }
+
+void kvm_mark_guest_state_protected(void)
+{
+    kvm_state->guest_state_protected = true;
+}
diff --git a/target/i386/sev.c b/target/i386/sev.c
index b8f79d34d19..c49a8fd55eb 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -755,6 +755,7 @@  sev_launch_get_measure(Notifier *notifier, void *unused)
         if (ret) {
             exit(1);
         }
+        kvm_mark_guest_state_protected();
     }
 
     /* query the measurement blob length */