Patchwork [2/2] kvm: exit to userspace with reason KVM_EXIT_VCPU_DEAD

login
register
mail settings
Submitter Liu Ping Fan
Date Nov. 27, 2011, 2:42 a.m.
Message ID <1322361735-21428-1-git-send-email-kernelfans@gmail.com>
Download mbox | patch
Permalink /patch/127828/
State New
Headers show

Comments

Liu Ping Fan - Nov. 27, 2011, 2:42 a.m.
From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>

The vcpu can be safely released when
--1.guest tells us that the vcpu is not needed any longer.
--2.vcpu hits the last instruction _halt_

If both of the conditions are satisfied, kvm exits to userspace
with the reason vcpu dead. So the user thread can exit safely.

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
---
 arch/x86/kvm/x86.c       |   16 ++++++++++++++++
 include/linux/kvm.h      |   11 +++++++++++
 include/linux/kvm_host.h |    1 +
 3 files changed, 28 insertions(+), 0 deletions(-)
Avi Kivity - Nov. 27, 2011, 10:36 a.m.
On 11/27/2011 04:42 AM, Liu Ping Fan wrote:
> From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>
> The vcpu can be safely released when
> --1.guest tells us that the vcpu is not needed any longer.
> --2.vcpu hits the last instruction _halt_
>
> If both of the conditions are satisfied, kvm exits to userspace
> with the reason vcpu dead. So the user thread can exit safely.
>
>

Seems to be completely unnecessary.  If you want to exit from the vcpu
thread, send it a signal.
Gleb Natapov - Nov. 27, 2011, 10:50 a.m.
On Sun, Nov 27, 2011 at 12:36:55PM +0200, Avi Kivity wrote:
> On 11/27/2011 04:42 AM, Liu Ping Fan wrote:
> > From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> >
> > The vcpu can be safely released when
> > --1.guest tells us that the vcpu is not needed any longer.
> > --2.vcpu hits the last instruction _halt_
> >
> > If both of the conditions are satisfied, kvm exits to userspace
> > with the reason vcpu dead. So the user thread can exit safely.
> >
> >
> 
> Seems to be completely unnecessary.  If you want to exit from the vcpu
> thread, send it a signal.
> 
Also if guest "tells us that the vcpu is not needed any longer" (via
ACPI I presume) and vcpu actually doing something critical instead of
sitting in 1:hlt; jmp 1b loop then it is guest's problem if it stops
working after vcpu destruction.

--
			Gleb.
Liu Ping Fan - Nov. 28, 2011, 7:16 a.m.
On Sun, Nov 27, 2011 at 6:50 PM, Gleb Natapov <gleb@redhat.com> wrote:
> On Sun, Nov 27, 2011 at 12:36:55PM +0200, Avi Kivity wrote:
>> On 11/27/2011 04:42 AM, Liu Ping Fan wrote:
>> > From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>> >
>> > The vcpu can be safely released when
>> > --1.guest tells us that the vcpu is not needed any longer.
>> > --2.vcpu hits the last instruction _halt_
>> >
>> > If both of the conditions are satisfied, kvm exits to userspace
>> > with the reason vcpu dead. So the user thread can exit safely.
>> >
>> >
>>
>> Seems to be completely unnecessary.  If you want to exit from the vcpu
>> thread, send it a signal.
>>
Hi Avi and Gleb,

First, I wanted to make sure my assumption is right, so I can grab
your meaning more clearly -:). Could you elaborate it for me, thanks.

I had thought that when a vcpu was being removed from guest, kvm must
satisfy the following conditions to safely remove the vcpu:
--1. The tasks on vcpu in GUEST  have already been migrated to other
vcpus and ONLY idle_task left ---- The CPU_DEAD is the checkpoint.
--2. We must wait the idle task to hit native_halt() in GUEST, till
that time, this vcpu is no needed even by idle_task. In KVM, the vcpu
thread will finally sit on "kvm_vcpu_block(vcpu);"
We CAN NOT suppose the sequence of the two condition because they come
from different threads.  Am I right?

And here comes my question,
--1. I think the signal will make vcpu_run exit to user, but is it
allow vcpu thread to finally call  "kernel/exit.c : void do_exit(long
code)" in current code in kvm or in qemu?
--2. If we got CPU_DEAD event, and then send a signal to vcpu thread,
could we ensure that we have already sit on "kvm_vcpu_block(vcpu);"

Thanks and regards,
ping fan

> Also if guest "tells us that the vcpu is not needed any longer" (via
> ACPI I presume) and vcpu actually doing something critical instead of
> sitting in 1:hlt; jmp 1b loop then it is guest's problem if it stops
> working after vcpu destruction.
>


> --
>                        Gleb.
>
Gleb Natapov - Nov. 28, 2011, 8:46 a.m.
On Mon, Nov 28, 2011 at 03:16:01PM +0800, Liu ping fan wrote:
> On Sun, Nov 27, 2011 at 6:50 PM, Gleb Natapov <gleb@redhat.com> wrote:
> > On Sun, Nov 27, 2011 at 12:36:55PM +0200, Avi Kivity wrote:
> >> On 11/27/2011 04:42 AM, Liu Ping Fan wrote:
> >> > From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> >> >
> >> > The vcpu can be safely released when
> >> > --1.guest tells us that the vcpu is not needed any longer.
> >> > --2.vcpu hits the last instruction _halt_
> >> >
> >> > If both of the conditions are satisfied, kvm exits to userspace
> >> > with the reason vcpu dead. So the user thread can exit safely.
> >> >
> >> >
> >>
> >> Seems to be completely unnecessary.  If you want to exit from the vcpu
> >> thread, send it a signal.
> >>
> Hi Avi and Gleb,
> 
> First, I wanted to make sure my assumption is right, so I can grab
> your meaning more clearly -:). Could you elaborate it for me, thanks.
> 
> I had thought that when a vcpu was being removed from guest, kvm must
> satisfy the following conditions to safely remove the vcpu:
> --1. The tasks on vcpu in GUEST  have already been migrated to other
> vcpus and ONLY idle_task left ---- The CPU_DEAD is the checkpoint.
> --2. We must wait the idle task to hit native_halt() in GUEST, till
> that time, this vcpu is no needed even by idle_task. In KVM, the vcpu
> thread will finally sit on "kvm_vcpu_block(vcpu);"
> We CAN NOT suppose the sequence of the two condition because they come
> from different threads.  Am I right?
> 
No, KVM can remove vcpu whenever it told to do so (may be not in the
middle of emulated io though). It is a guest responsibility to eject cpu
only when it is safe to do so from guest's point of view.

> And here comes my question,
> --1. I think the signal will make vcpu_run exit to user, but is it
> allow vcpu thread to finally call  "kernel/exit.c : void do_exit(long
> code)" in current code in kvm or in qemu?
Yes. Why not?

> --2. If we got CPU_DEAD event, and then send a signal to vcpu thread,
> could we ensure that we have already sit on "kvm_vcpu_block(vcpu);"
CPU_DEAD event is internal to a guest (one of them). KVM does not care
about it. And to remove vcpu it does not have to sit in kvm_vcpu_block().
And actually since signal kicks vcpu thread out from kernel into userspace
you can be sure it is not sitting in kvm_vcpu_block(). 

> 
> Thanks and regards,
> ping fan
> 
> > Also if guest "tells us that the vcpu is not needed any longer" (via
> > ACPI I presume) and vcpu actually doing something critical instead of
> > sitting in 1:hlt; jmp 1b loop then it is guest's problem if it stops
> > working after vcpu destruction.
> >
> 
> 
> > --
> >                        Gleb.
> >

--
			Gleb.

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ea2315a..7948eaf 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5825,11 +5825,27 @@  static int __vcpu_run(struct kvm_vcpu *vcpu)
 		    !vcpu->arch.apf.halted)
 			r = vcpu_enter_guest(vcpu);
 		else {
+retry:
+			if  (vcpu->arch.mp_state == KVM_MP_STATE_HALTED) {
+				/*1st check whether guest notify CPU_DEAD*/
+				if (vcpu->state == KVM_VCPU_STATE_DYING) {
+					vcpu->state = KVM_VCPU_STATE_DEAD;
+					vcpu->run->exit_reason = KVM_EXIT_VCPU_DEAD;
+					break;
+				}
+			}
 			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
 			kvm_vcpu_block(vcpu);
 			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 			if (kvm_check_request(KVM_REQ_UNHALT, vcpu))
 			{
+				switch (vcpu->state) {
+				case KVM_VCPU_STATE_DYING:
+					r = 1;
+					goto retry;
+				default:
+					break;
+				}
 				switch(vcpu->arch.mp_state) {
 				case KVM_MP_STATE_HALTED:
 					vcpu->arch.mp_state =
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index c3892fc..d5ff3f7 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -162,6 +162,7 @@  struct kvm_pit_config {
 #define KVM_EXIT_INTERNAL_ERROR   17
 #define KVM_EXIT_OSI              18
 #define KVM_EXIT_PAPR_HCALL	  19
+#define KVM_EXIT_VCPU_DEAD              20
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 #define KVM_INTERNAL_ERROR_EMULATION 1
@@ -334,6 +335,12 @@  struct kvm_signal_mask {
 	__u8  sigset[0];
 };
 
+/*for KVM_VCPU_SET_STATE */
+struct kvm_vcpu_state {
+	int vcpu_id;
+	int state;
+};
+
 /* for KVM_TPR_ACCESS_REPORTING */
 struct kvm_tpr_access_ctl {
 	__u32 enabled;
@@ -354,6 +361,9 @@  struct kvm_vapic_addr {
 #define KVM_MP_STATE_HALTED            3
 #define KVM_MP_STATE_SIPI_RECEIVED     4
 
+#define KVM_VCPU_STATE_DYING 1
+#define KVM_VCPU_STATE_DEAD 2
+
 struct kvm_mp_state {
 	__u32 mp_state;
 };
@@ -762,6 +772,7 @@  struct kvm_clock_data {
 #define KVM_CREATE_SPAPR_TCE	  _IOW(KVMIO,  0xa8, struct kvm_create_spapr_tce)
 /* Available with KVM_CAP_RMA */
 #define KVM_ALLOCATE_RMA	  _IOR(KVMIO,  0xa9, struct kvm_allocate_rma)
+#define KVM_SETSTATE_VCPU     _IOW(KVMIO,   0xaa, struct kvm_vcpu_state)
 
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index fe35078..6fdf927 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -114,6 +114,7 @@  enum {
 struct kvm_vcpu {
 	struct kvm *kvm;
 	struct kref refcount;
+	int state;
 #ifdef CONFIG_PREEMPT_NOTIFIERS
 	struct preempt_notifier preempt_notifier;
 #endif