diff mbox series

[SRU,F,2/2] KVM: s390: pv: don't allow userspace to set the clock under PV

Message ID 20230216095500.24414-3-roxana.nicolescu@canonical.com
State New
Headers show
Series [SRU,F,1/2] KVM: s390x: fix SCK locking | expand

Commit Message

Roxana Nicolescu Feb. 16, 2023, 9:55 a.m. UTC
From: Nico Boehr <nrb@linux.ibm.com>

BugLink: https://bugs.launchpad.net/bugs/1999882

[ Upstream commit 6973091d1b50ab4042f6a2d495f59e9db3662ab8 ]

When running under PV, the guest's TOD clock is under control of the
ultravisor and the hypervisor isn't allowed to change it. Hence, don't
allow userspace to change the guest's TOD clock by returning
-EOPNOTSUPP.

When userspace changes the guest's TOD clock, KVM updates its
kvm.arch.epoch field and, in addition, the epoch field in all state
descriptions of all VCPUs.

But, under PV, the ultravisor will ignore the epoch field in the state
description and simply overwrite it on next SIE exit with the actual
guest epoch. This leads to KVM having an incorrect view of the guest's
TOD clock: it has updated its internal kvm.arch.epoch field, but the
ultravisor ignores the field in the state description.

Whenever a guest is now waiting for a clock comparator, KVM will
incorrectly calculate the time when the guest should wake up, possibly
causing the guest to sleep for much longer than expected.

With this change, kvm_s390_set_tod() will now take the kvm->lock to be
able to call kvm_s390_pv_is_protected(). Since kvm_s390_set_tod_clock()
also takes kvm->lock, use __kvm_s390_set_tod_clock() instead.

The function kvm_s390_set_tod_clock is now unused, hence remove it.
Update the documentation to indicate the TOD clock attr calls can now
return -EOPNOTSUPP.

Fixes: 0f3035047140 ("KVM: s390: protvirt: Do only reset registers that are accessible")
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20221011160712.928239-2-nrb@linux.ibm.com
Message-Id: <20221011160712.928239-2-nrb@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(backported from commmit 6973091d1b50ab4042f6a2d495f59e9db3662ab8)
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
---
 Documentation/virt/kvm/devices/vm.txt |  4 ++++
 arch/s390/kvm/kvm-s390.c              | 26 +++++++++++++++++---------
 arch/s390/kvm/kvm-s390.h              |  1 -
 3 files changed, 21 insertions(+), 10 deletions(-)

Comments

Stefan Bader Feb. 17, 2023, 9:05 a.m. UTC | #1
On 16.02.23 10:55, Roxana Nicolescu wrote:
> From: Nico Boehr <nrb@linux.ibm.com>
> 
> BugLink: https://bugs.launchpad.net/bugs/1999882
> 
> [ Upstream commit 6973091d1b50ab4042f6a2d495f59e9db3662ab8 ]
> 
> When running under PV, the guest's TOD clock is under control of the
> ultravisor and the hypervisor isn't allowed to change it. Hence, don't
> allow userspace to change the guest's TOD clock by returning
> -EOPNOTSUPP.
> 
> When userspace changes the guest's TOD clock, KVM updates its
> kvm.arch.epoch field and, in addition, the epoch field in all state
> descriptions of all VCPUs.
> 
> But, under PV, the ultravisor will ignore the epoch field in the state
> description and simply overwrite it on next SIE exit with the actual
> guest epoch. This leads to KVM having an incorrect view of the guest's
> TOD clock: it has updated its internal kvm.arch.epoch field, but the
> ultravisor ignores the field in the state description.
> 
> Whenever a guest is now waiting for a clock comparator, KVM will
> incorrectly calculate the time when the guest should wake up, possibly
> causing the guest to sleep for much longer than expected.
> 
> With this change, kvm_s390_set_tod() will now take the kvm->lock to be
> able to call kvm_s390_pv_is_protected(). Since kvm_s390_set_tod_clock()
> also takes kvm->lock, use __kvm_s390_set_tod_clock() instead.
> 
> The function kvm_s390_set_tod_clock is now unused, hence remove it.
> Update the documentation to indicate the TOD clock attr calls can now
> return -EOPNOTSUPP.
> 
> Fixes: 0f3035047140 ("KVM: s390: protvirt: Do only reset registers that are accessible")
> Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
> Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
> Link: https://lore.kernel.org/r/20221011160712.928239-2-nrb@linux.ibm.com
> Message-Id: <20221011160712.928239-2-nrb@linux.ibm.com>
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> (backported from commmit 6973091d1b50ab4042f6a2d495f59e9db3662ab8)

Same here. So I would like to ask you to submit a v2 just with those 
additional notes. The rest, as far as I can see, looks good.

-Stefan

> Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
> ---
>   Documentation/virt/kvm/devices/vm.txt |  4 ++++
>   arch/s390/kvm/kvm-s390.c              | 26 +++++++++++++++++---------
>   arch/s390/kvm/kvm-s390.h              |  1 -
>   3 files changed, 21 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/devices/vm.txt b/Documentation/virt/kvm/devices/vm.txt
> index 4ffb82b02468..e8c28c299144 100644
> --- a/Documentation/virt/kvm/devices/vm.txt
> +++ b/Documentation/virt/kvm/devices/vm.txt
> @@ -183,6 +183,7 @@ KVM_S390_VM_TOD_EXT).
>   Parameters: address of a buffer in user space to store the data (u8) to
>   Returns:    -EFAULT if the given address is not accessible from kernel space
>   	    -EINVAL if setting the TOD clock extension to != 0 is not supported
> +        -EOPNOTSUPP for a PV guest (TOD managed by the ultravisor)
>   
>   3.2. ATTRIBUTE: KVM_S390_VM_TOD_LOW
>   
> @@ -191,6 +192,8 @@ the POP (u64).
>   
>   Parameters: address of a buffer in user space to store the data (u64) to
>   Returns:    -EFAULT if the given address is not accessible from kernel space
> +        -EOPNOTSUPP for a PV guest (TOD managed by the ultravisor)
> +
>   
>   3.3. ATTRIBUTE: KVM_S390_VM_TOD_EXT
>   Allows user space to set/get bits 0-63 of the TOD clock register as defined in
> @@ -202,6 +205,7 @@ Parameters: address of a buffer in user space to store the data
>               (kvm_s390_vm_tod_clock) to
>   Returns:    -EFAULT if the given address is not accessible from kernel space
>   	    -EINVAL if setting the TOD clock extension to != 0 is not supported
> +        -EOPNOTSUPP for a PV guest (TOD managed by the ultravisor)
>   
>   4. GROUP: KVM_S390_VM_CRYPTO
>   Architectures: s390
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 0f1b0dde0de3..f505999708bd 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -1096,6 +1096,8 @@ static int kvm_s390_vm_get_migration(struct kvm *kvm,
>   	return 0;
>   }
>   
> +static void __kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod);
> +
>   static int kvm_s390_set_tod_ext(struct kvm *kvm, struct kvm_device_attr *attr)
>   {
>   	struct kvm_s390_vm_tod_clock gtod;
> @@ -1105,7 +1107,7 @@ static int kvm_s390_set_tod_ext(struct kvm *kvm, struct kvm_device_attr *attr)
>   
>   	if (!test_kvm_facility(kvm, 139) && gtod.epoch_idx)
>   		return -EINVAL;
> -	kvm_s390_set_tod_clock(kvm, &gtod);
> +	__kvm_s390_set_tod_clock(kvm, &gtod);
>   
>   	VM_EVENT(kvm, 3, "SET: TOD extension: 0x%x, TOD base: 0x%llx",
>   		gtod.epoch_idx, gtod.tod);
> @@ -1136,7 +1138,7 @@ static int kvm_s390_set_tod_low(struct kvm *kvm, struct kvm_device_attr *attr)
>   			   sizeof(gtod.tod)))
>   		return -EFAULT;
>   
> -	kvm_s390_set_tod_clock(kvm, &gtod);
> +	__kvm_s390_set_tod_clock(kvm, &gtod);
>   	VM_EVENT(kvm, 3, "SET: TOD base: 0x%llx", gtod.tod);
>   	return 0;
>   }
> @@ -1148,6 +1150,16 @@ static int kvm_s390_set_tod(struct kvm *kvm, struct kvm_device_attr *attr)
>   	if (attr->flags)
>   		return -EINVAL;
>   
> +	mutex_lock(&kvm->lock);
> +	/*
> +	 * For protected guests, the TOD is managed by the ultravisor, so trying
> +	 * to change it will never bring the expected results.
> +	 */
> +	if (kvm_s390_pv_is_protected(kvm)) {
> +		ret = -EOPNOTSUPP;
> +		goto out_unlock;
> +	}
> +
>   	switch (attr->attr) {
>   	case KVM_S390_VM_TOD_EXT:
>   		ret = kvm_s390_set_tod_ext(kvm, attr);
> @@ -1162,6 +1174,9 @@ static int kvm_s390_set_tod(struct kvm *kvm, struct kvm_device_attr *attr)
>   		ret = -ENXIO;
>   		break;
>   	}
> +
> +out_unlock:
> +	mutex_unlock(&kvm->lock);
>   	return ret;
>   }
>   
> @@ -3968,13 +3983,6 @@ static void __kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_t
>   	preempt_enable();
>   }
>   
> -void kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod)
> -{
> -	mutex_lock(&kvm->lock);
> -	__kvm_s390_set_tod_clock(kvm, gtod);
> -	mutex_unlock(&kvm->lock);
> -}
> -
>   int kvm_s390_try_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod)
>   {
>   	if (!mutex_trylock(&kvm->lock))
> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
> index c94c1e29eeca..487b2bd53599 100644
> --- a/arch/s390/kvm/kvm-s390.h
> +++ b/arch/s390/kvm/kvm-s390.h
> @@ -335,7 +335,6 @@ int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu);
>   int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu);
>   
>   /* implemented in kvm-s390.c */
> -void kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod);
>   int kvm_s390_try_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod);
>   long kvm_arch_fault_in_page(struct kvm_vcpu *vcpu, gpa_t gpa, int writable);
>   int kvm_s390_store_status_unloaded(struct kvm_vcpu *vcpu, unsigned long addr);
diff mbox series

Patch

diff --git a/Documentation/virt/kvm/devices/vm.txt b/Documentation/virt/kvm/devices/vm.txt
index 4ffb82b02468..e8c28c299144 100644
--- a/Documentation/virt/kvm/devices/vm.txt
+++ b/Documentation/virt/kvm/devices/vm.txt
@@ -183,6 +183,7 @@  KVM_S390_VM_TOD_EXT).
 Parameters: address of a buffer in user space to store the data (u8) to
 Returns:    -EFAULT if the given address is not accessible from kernel space
 	    -EINVAL if setting the TOD clock extension to != 0 is not supported
+        -EOPNOTSUPP for a PV guest (TOD managed by the ultravisor)
 
 3.2. ATTRIBUTE: KVM_S390_VM_TOD_LOW
 
@@ -191,6 +192,8 @@  the POP (u64).
 
 Parameters: address of a buffer in user space to store the data (u64) to
 Returns:    -EFAULT if the given address is not accessible from kernel space
+        -EOPNOTSUPP for a PV guest (TOD managed by the ultravisor)
+
 
 3.3. ATTRIBUTE: KVM_S390_VM_TOD_EXT
 Allows user space to set/get bits 0-63 of the TOD clock register as defined in
@@ -202,6 +205,7 @@  Parameters: address of a buffer in user space to store the data
             (kvm_s390_vm_tod_clock) to
 Returns:    -EFAULT if the given address is not accessible from kernel space
 	    -EINVAL if setting the TOD clock extension to != 0 is not supported
+        -EOPNOTSUPP for a PV guest (TOD managed by the ultravisor)
 
 4. GROUP: KVM_S390_VM_CRYPTO
 Architectures: s390
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 0f1b0dde0de3..f505999708bd 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -1096,6 +1096,8 @@  static int kvm_s390_vm_get_migration(struct kvm *kvm,
 	return 0;
 }
 
+static void __kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod);
+
 static int kvm_s390_set_tod_ext(struct kvm *kvm, struct kvm_device_attr *attr)
 {
 	struct kvm_s390_vm_tod_clock gtod;
@@ -1105,7 +1107,7 @@  static int kvm_s390_set_tod_ext(struct kvm *kvm, struct kvm_device_attr *attr)
 
 	if (!test_kvm_facility(kvm, 139) && gtod.epoch_idx)
 		return -EINVAL;
-	kvm_s390_set_tod_clock(kvm, &gtod);
+	__kvm_s390_set_tod_clock(kvm, &gtod);
 
 	VM_EVENT(kvm, 3, "SET: TOD extension: 0x%x, TOD base: 0x%llx",
 		gtod.epoch_idx, gtod.tod);
@@ -1136,7 +1138,7 @@  static int kvm_s390_set_tod_low(struct kvm *kvm, struct kvm_device_attr *attr)
 			   sizeof(gtod.tod)))
 		return -EFAULT;
 
-	kvm_s390_set_tod_clock(kvm, &gtod);
+	__kvm_s390_set_tod_clock(kvm, &gtod);
 	VM_EVENT(kvm, 3, "SET: TOD base: 0x%llx", gtod.tod);
 	return 0;
 }
@@ -1148,6 +1150,16 @@  static int kvm_s390_set_tod(struct kvm *kvm, struct kvm_device_attr *attr)
 	if (attr->flags)
 		return -EINVAL;
 
+	mutex_lock(&kvm->lock);
+	/*
+	 * For protected guests, the TOD is managed by the ultravisor, so trying
+	 * to change it will never bring the expected results.
+	 */
+	if (kvm_s390_pv_is_protected(kvm)) {
+		ret = -EOPNOTSUPP;
+		goto out_unlock;
+	}
+
 	switch (attr->attr) {
 	case KVM_S390_VM_TOD_EXT:
 		ret = kvm_s390_set_tod_ext(kvm, attr);
@@ -1162,6 +1174,9 @@  static int kvm_s390_set_tod(struct kvm *kvm, struct kvm_device_attr *attr)
 		ret = -ENXIO;
 		break;
 	}
+
+out_unlock:
+	mutex_unlock(&kvm->lock);
 	return ret;
 }
 
@@ -3968,13 +3983,6 @@  static void __kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_t
 	preempt_enable();
 }
 
-void kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod)
-{
-	mutex_lock(&kvm->lock);
-	__kvm_s390_set_tod_clock(kvm, gtod);
-	mutex_unlock(&kvm->lock);
-}
-
 int kvm_s390_try_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod)
 {
 	if (!mutex_trylock(&kvm->lock))
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index c94c1e29eeca..487b2bd53599 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -335,7 +335,6 @@  int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu);
 int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu);
 
 /* implemented in kvm-s390.c */
-void kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod);
 int kvm_s390_try_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod);
 long kvm_arch_fault_in_page(struct kvm_vcpu *vcpu, gpa_t gpa, int writable);
 int kvm_s390_store_status_unloaded(struct kvm_vcpu *vcpu, unsigned long addr);