Patchwork [v7.5] kvm: notify host when the guest is panicked

login
register
mail settings
Submitter Wen Congyang
Date July 21, 2012, 8:44 a.m.
Message ID <500A6BF1.4030002@cn.fujitsu.com>
Download mbox | patch
Permalink /patch/172414/
State New
Headers show

Comments

Wen Congyang - July 21, 2012, 8:44 a.m.
We can know the guest is panicked when the guest runs on xen.
But we do not have such feature on kvm.

Another purpose of this feature is: management app(for example:
libvirt) can do auto dump when the guest is panicked. If management
app does not do auto dump, the guest's user can do dump by hand if
he sees the guest is panicked.

We have three solutions to implement this feature:
1. use vmcall
2. use I/O port
3. use virtio-serial.

We have decided to avoid touching hypervisor. The reason why I choose
choose the I/O port is:
1. it is easier to implememt
2. it does not depend any virtual device
3. it can work when starting the kernel

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 arch/ia64/include/asm/kvm_para.h    |    5 +++++
 arch/powerpc/include/asm/kvm_para.h |    5 +++++
 arch/s390/include/asm/kvm_para.h    |    5 +++++
 arch/x86/include/asm/kvm_para.h     |   13 +++++++++++++
 arch/x86/kernel/kvm.c               |   14 ++++++++++++++
 include/linux/kvm_para.h            |   13 +++++++++++++
 6 files changed, 55 insertions(+), 0 deletions(-)
Sasha Levin - July 22, 2012, 11:39 a.m.
On 07/21/2012 10:44 AM, Wen Congyang wrote:
> We can know the guest is panicked when the guest runs on xen.
> But we do not have such feature on kvm.
> 
> Another purpose of this feature is: management app(for example:
> libvirt) can do auto dump when the guest is panicked. If management
> app does not do auto dump, the guest's user can do dump by hand if
> he sees the guest is panicked.
> 
> We have three solutions to implement this feature:
> 1. use vmcall
> 2. use I/O port
> 3. use virtio-serial.
> 
> We have decided to avoid touching hypervisor. The reason why I choose
> choose the I/O port is:
> 1. it is easier to implememt
> 2. it does not depend any virtual device
> 3. it can work when starting the kernel

Was the option of implementing a virtio-watchdog driver considered?

You're basically re-implementing a watchdog, a guest-host interface and a set of protocols for guest-host communications.

Why can't we re-use everything we have now, push a virtio watchdog driver into drivers/watchdog/, and gain a more complete solution to detecting hangs inside the guest.
Anthony Liguori - July 22, 2012, 7:14 p.m.
Sasha Levin <levinsasha928@gmail.com> writes:

> On 07/21/2012 10:44 AM, Wen Congyang wrote:
>> We can know the guest is panicked when the guest runs on xen.
>> But we do not have such feature on kvm.
>> 
>> Another purpose of this feature is: management app(for example:
>> libvirt) can do auto dump when the guest is panicked. If management
>> app does not do auto dump, the guest's user can do dump by hand if
>> he sees the guest is panicked.
>> 
>> We have three solutions to implement this feature:
>> 1. use vmcall
>> 2. use I/O port
>> 3. use virtio-serial.
>> 
>> We have decided to avoid touching hypervisor. The reason why I choose
>> choose the I/O port is:
>> 1. it is easier to implememt
>> 2. it does not depend any virtual device
>> 3. it can work when starting the kernel
>
> Was the option of implementing a virtio-watchdog driver considered?
>
> You're basically re-implementing a watchdog, a guest-host interface and a set of protocols for guest-host communications.
>
> Why can't we re-use everything we have now, push a virtio watchdog
> driver into drivers/watchdog/, and gain a more complete solution to
> detecting hangs inside the guest.

The purpose of virtio is not to reinvent every possible type of device.
There are plenty of hardware watchdogs that are very suitable to be used
for this purpose.  QEMU implements quite a few already.

Watchdogs are not performance sensitive so there's no point in using
virtio.

Regards,

Anthony Liguori
Sasha Levin - July 22, 2012, 8:03 p.m.
On 07/22/2012 09:14 PM, Anthony Liguori wrote:
> Sasha Levin <levinsasha928@gmail.com> writes:
> 
>> On 07/21/2012 10:44 AM, Wen Congyang wrote:
>>> We can know the guest is panicked when the guest runs on xen.
>>> But we do not have such feature on kvm.
>>>
>>> Another purpose of this feature is: management app(for example:
>>> libvirt) can do auto dump when the guest is panicked. If management
>>> app does not do auto dump, the guest's user can do dump by hand if
>>> he sees the guest is panicked.
>>>
>>> We have three solutions to implement this feature:
>>> 1. use vmcall
>>> 2. use I/O port
>>> 3. use virtio-serial.
>>>
>>> We have decided to avoid touching hypervisor. The reason why I choose
>>> choose the I/O port is:
>>> 1. it is easier to implememt
>>> 2. it does not depend any virtual device
>>> 3. it can work when starting the kernel
>>
>> Was the option of implementing a virtio-watchdog driver considered?
>>
>> You're basically re-implementing a watchdog, a guest-host interface and a set of protocols for guest-host communications.
>>
>> Why can't we re-use everything we have now, push a virtio watchdog
>> driver into drivers/watchdog/, and gain a more complete solution to
>> detecting hangs inside the guest.
> 
> The purpose of virtio is not to reinvent every possible type of device.
> There are plenty of hardware watchdogs that are very suitable to be used
> for this purpose.  QEMU implements quite a few already.
> 
> Watchdogs are not performance sensitive so there's no point in using
> virtio.

The issue here is not performance, but the adding of a brand new guest-host interface.

virtio-rng isn't performance sensitive either, yet it was implemented using virtio so there wouldn't be yet another interface to communicate between guest and host.

This patch goes ahead to add a "arch pv features" interface using ioports, without any idea what it might be used for beyond this watchdog.
Anthony Liguori - July 22, 2012, 10:36 p.m.
Sasha Levin <levinsasha928@gmail.com> writes:

> On 07/22/2012 09:14 PM, Anthony Liguori wrote:
>> Sasha Levin <levinsasha928@gmail.com> writes:
>> 
>>> On 07/21/2012 10:44 AM, Wen Congyang wrote:
>>>> We can know the guest is panicked when the guest runs on xen.
>>>> But we do not have such feature on kvm.
>>>>
>>>> Another purpose of this feature is: management app(for example:
>>>> libvirt) can do auto dump when the guest is panicked. If management
>>>> app does not do auto dump, the guest's user can do dump by hand if
>>>> he sees the guest is panicked.
>>>>
>>>> We have three solutions to implement this feature:
>>>> 1. use vmcall
>>>> 2. use I/O port
>>>> 3. use virtio-serial.
>>>>
>>>> We have decided to avoid touching hypervisor. The reason why I choose
>>>> choose the I/O port is:
>>>> 1. it is easier to implememt
>>>> 2. it does not depend any virtual device
>>>> 3. it can work when starting the kernel
>>>
>>> Was the option of implementing a virtio-watchdog driver considered?
>>>
>>> You're basically re-implementing a watchdog, a guest-host interface and a set of protocols for guest-host communications.
>>>
>>> Why can't we re-use everything we have now, push a virtio watchdog
>>> driver into drivers/watchdog/, and gain a more complete solution to
>>> detecting hangs inside the guest.
>> 
>> The purpose of virtio is not to reinvent every possible type of device.
>> There are plenty of hardware watchdogs that are very suitable to be used
>> for this purpose.  QEMU implements quite a few already.
>> 
>> Watchdogs are not performance sensitive so there's no point in using
>> virtio.
>
> The issue here is not performance, but the adding of a brand new
> guest-host interface.

We have:

1) Virtio--this is our preferred PV interface.  It needs PCI to be fully
initialized and probably will live as a module.

2) Hypercalls--this a secondary PV interface but is available very
early.  It's terminated in kvm.ko which means it can only operate on
things that are logically part of the CPU and/or APIC complex.

This patch introduces a third interface which is available early like
hypercalls but not necessarily terminated in kvm.ko.  That means it can
have a broader scope in functionality than (2).

We could just as well use a hypercall and have multiple commands issued
to that hypercall as a convention and add a new exit type to KVM that
sent that specific hypercall to userspace for processing.

But a PIO operation already has this behavior and requires no changes to kvm.ko.

> virtio-rng isn't performance sensitive either, yet it was implemented
> using virtio so there wouldn't be yet another interface to communicate
> between guest and host.

There isn't really an obvious discrete RNG that is widely supported.

> This patch goes ahead to add a "arch pv features" interface using
> ioports, without any idea what it might be used for beyond this
> watchdog.

It's not a watchdog--it's the opposite of a watchdog.

You know such a thing already exists in the kernel, right?  S390 has had
a hypercall like this for years.

Regards,

Anthony Liguori
Sasha Levin - July 22, 2012, 11:50 p.m.
On 07/23/2012 12:36 AM, Anthony Liguori wrote:
> Sasha Levin <levinsasha928@gmail.com> writes:
> 
>> On 07/22/2012 09:14 PM, Anthony Liguori wrote:
>>> Sasha Levin <levinsasha928@gmail.com> writes:
>>>
>>>> On 07/21/2012 10:44 AM, Wen Congyang wrote:
>>>>> We can know the guest is panicked when the guest runs on xen.
>>>>> But we do not have such feature on kvm.
>>>>>
>>>>> Another purpose of this feature is: management app(for example:
>>>>> libvirt) can do auto dump when the guest is panicked. If management
>>>>> app does not do auto dump, the guest's user can do dump by hand if
>>>>> he sees the guest is panicked.
>>>>>
>>>>> We have three solutions to implement this feature:
>>>>> 1. use vmcall
>>>>> 2. use I/O port
>>>>> 3. use virtio-serial.
>>>>>
>>>>> We have decided to avoid touching hypervisor. The reason why I choose
>>>>> choose the I/O port is:
>>>>> 1. it is easier to implememt
>>>>> 2. it does not depend any virtual device
>>>>> 3. it can work when starting the kernel
>>>>
>>>> Was the option of implementing a virtio-watchdog driver considered?
>>>>
>>>> You're basically re-implementing a watchdog, a guest-host interface and a set of protocols for guest-host communications.
>>>>
>>>> Why can't we re-use everything we have now, push a virtio watchdog
>>>> driver into drivers/watchdog/, and gain a more complete solution to
>>>> detecting hangs inside the guest.
>>>
>>> The purpose of virtio is not to reinvent every possible type of device.
>>> There are plenty of hardware watchdogs that are very suitable to be used
>>> for this purpose.  QEMU implements quite a few already.
>>>
>>> Watchdogs are not performance sensitive so there's no point in using
>>> virtio.
>>
>> The issue here is not performance, but the adding of a brand new
>> guest-host interface.
> 
> We have:
> 
> 1) Virtio--this is our preferred PV interface.  It needs PCI to be fully
> initialized and probably will live as a module.
> 
> 2) Hypercalls--this a secondary PV interface but is available very
> early.  It's terminated in kvm.ko which means it can only operate on
> things that are logically part of the CPU and/or APIC complex.
> 
> This patch introduces a third interface which is available early like
> hypercalls but not necessarily terminated in kvm.ko.  That means it can
> have a broader scope in functionality than (2).
> 
> We could just as well use a hypercall and have multiple commands issued
> to that hypercall as a convention and add a new exit type to KVM that
> sent that specific hypercall to userspace for processing.
> 
> But a PIO operation already has this behavior and requires no changes to kvm.ko.

I don't dispute that there may be a need for another guest-host interface, but this patch can basically be called "kvm: notify host when the guest is panicked, oh, btw, and add a brand new undocumented interface"

The new interface should at least come in it's own patch, with documentation.
Wen Congyang - July 23, 2012, 2:07 a.m.
At 07/22/2012 07:39 PM, Sasha Levin Wrote:
> On 07/21/2012 10:44 AM, Wen Congyang wrote:
>> We can know the guest is panicked when the guest runs on xen.
>> But we do not have such feature on kvm.
>>
>> Another purpose of this feature is: management app(for example:
>> libvirt) can do auto dump when the guest is panicked. If management
>> app does not do auto dump, the guest's user can do dump by hand if
>> he sees the guest is panicked.
>>
>> We have three solutions to implement this feature:
>> 1. use vmcall
>> 2. use I/O port
>> 3. use virtio-serial.
>>
>> We have decided to avoid touching hypervisor. The reason why I choose
>> choose the I/O port is:
>> 1. it is easier to implememt
>> 2. it does not depend any virtual device
>> 3. it can work when starting the kernel
> 
> Was the option of implementing a virtio-watchdog driver considered?

virtio-watchdog? What is this? I don't find it in qemu. Do I miss something?

Another reason why we don't use this:
If the watchdog timeouts, we cannot say the kernel is panicked. For
example, the kernel is hung, or the kernel is deadlock, or ...
the watchdog daemon can not have chance to touch watchdog device.

Thanks
Wen Congyang

> 
> You're basically re-implementing a watchdog, a guest-host interface and a set of protocols for guest-host communications.
> 
> Why can't we re-use everything we have now, push a virtio watchdog driver into drivers/watchdog/, and gain a more complete solution to detecting hangs inside the guest.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>
Wen Congyang - July 23, 2012, 2:08 a.m.
At 07/23/2012 07:50 AM, Sasha Levin Wrote:
> On 07/23/2012 12:36 AM, Anthony Liguori wrote:
>> Sasha Levin <levinsasha928@gmail.com> writes:
>>
>>> On 07/22/2012 09:14 PM, Anthony Liguori wrote:
>>>> Sasha Levin <levinsasha928@gmail.com> writes:
>>>>
>>>>> On 07/21/2012 10:44 AM, Wen Congyang wrote:
>>>>>> We can know the guest is panicked when the guest runs on xen.
>>>>>> But we do not have such feature on kvm.
>>>>>>
>>>>>> Another purpose of this feature is: management app(for example:
>>>>>> libvirt) can do auto dump when the guest is panicked. If management
>>>>>> app does not do auto dump, the guest's user can do dump by hand if
>>>>>> he sees the guest is panicked.
>>>>>>
>>>>>> We have three solutions to implement this feature:
>>>>>> 1. use vmcall
>>>>>> 2. use I/O port
>>>>>> 3. use virtio-serial.
>>>>>>
>>>>>> We have decided to avoid touching hypervisor. The reason why I choose
>>>>>> choose the I/O port is:
>>>>>> 1. it is easier to implememt
>>>>>> 2. it does not depend any virtual device
>>>>>> 3. it can work when starting the kernel
>>>>>
>>>>> Was the option of implementing a virtio-watchdog driver considered?
>>>>>
>>>>> You're basically re-implementing a watchdog, a guest-host interface and a set of protocols for guest-host communications.
>>>>>
>>>>> Why can't we re-use everything we have now, push a virtio watchdog
>>>>> driver into drivers/watchdog/, and gain a more complete solution to
>>>>> detecting hangs inside the guest.
>>>>
>>>> The purpose of virtio is not to reinvent every possible type of device.
>>>> There are plenty of hardware watchdogs that are very suitable to be used
>>>> for this purpose.  QEMU implements quite a few already.
>>>>
>>>> Watchdogs are not performance sensitive so there's no point in using
>>>> virtio.
>>>
>>> The issue here is not performance, but the adding of a brand new
>>> guest-host interface.
>>
>> We have:
>>
>> 1) Virtio--this is our preferred PV interface.  It needs PCI to be fully
>> initialized and probably will live as a module.
>>
>> 2) Hypercalls--this a secondary PV interface but is available very
>> early.  It's terminated in kvm.ko which means it can only operate on
>> things that are logically part of the CPU and/or APIC complex.
>>
>> This patch introduces a third interface which is available early like
>> hypercalls but not necessarily terminated in kvm.ko.  That means it can
>> have a broader scope in functionality than (2).
>>
>> We could just as well use a hypercall and have multiple commands issued
>> to that hypercall as a convention and add a new exit type to KVM that
>> sent that specific hypercall to userspace for processing.
>>
>> But a PIO operation already has this behavior and requires no changes to kvm.ko.
> 
> I don't dispute that there may be a need for another guest-host interface, but this patch can basically be called "kvm: notify host when the guest is panicked, oh, btw, and add a brand new undocumented interface"

I forgot to document this interface. I will add it.

Thanks
Wen Congyang

> 
> The new interface should at least come in it's own patch, with documentation.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

Patch

diff --git a/arch/ia64/include/asm/kvm_para.h b/arch/ia64/include/asm/kvm_para.h
index 2019cb9..187c0e2 100644
--- a/arch/ia64/include/asm/kvm_para.h
+++ b/arch/ia64/include/asm/kvm_para.h
@@ -31,6 +31,11 @@  static inline bool kvm_check_and_clear_guest_paused(void)
 	return false;
 }
 
+static inline unsigned int kvm_arch_pv_features(void)
+{
+	return 0;
+}
+
 #endif
 
 #endif
diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
index c18916b..be81aac 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -211,6 +211,11 @@  static inline bool kvm_check_and_clear_guest_paused(void)
 	return false;
 }
 
+static inline unsigned int kvm_arch_pv_features(void)
+{
+	return 0;
+}
+
 #endif /* __KERNEL__ */
 
 #endif /* __POWERPC_KVM_PARA_H__ */
diff --git a/arch/s390/include/asm/kvm_para.h b/arch/s390/include/asm/kvm_para.h
index a988329..3d993b7 100644
--- a/arch/s390/include/asm/kvm_para.h
+++ b/arch/s390/include/asm/kvm_para.h
@@ -154,6 +154,11 @@  static inline bool kvm_check_and_clear_guest_paused(void)
 	return false;
 }
 
+static inline unsigned int kvm_arch_pv_features(void)
+{
+	return 0;
+}
+
 #endif
 
 #endif /* __S390_KVM_PARA_H */
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 63ab166..647561b 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -89,6 +89,8 @@  struct kvm_vcpu_pv_apf_data {
 	__u32 enabled;
 };
 
+#define KVM_PV_PORT	(0x505UL)
+
 #ifdef __KERNEL__
 #include <asm/processor.h>
 
@@ -221,6 +223,17 @@  static inline void kvm_disable_steal_time(void)
 }
 #endif
 
+static inline unsigned int kvm_arch_pv_features(void)
+{
+	unsigned int features = inl(KVM_PV_PORT);
+
+	/* Reading from an invalid I/O port will return -1 */
+	if (features == ~0)
+		features = 0;
+
+	return features;
+}
+
 #endif /* __KERNEL__ */
 
 #endif /* _ASM_X86_KVM_PARA_H */
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index e554e5a..9a97f7e 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -328,6 +328,17 @@  static struct notifier_block kvm_pv_reboot_nb = {
 	.notifier_call = kvm_pv_reboot_notify,
 };
 
+static int
+kvm_pv_panic_notify(struct notifier_block *nb, unsigned long code, void *unused)
+{
+	outl(KVM_PV_PANICKED, KVM_PV_PORT);
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block kvm_pv_panic_nb = {
+	.notifier_call = kvm_pv_panic_notify,
+};
+
 static u64 kvm_steal_clock(int cpu)
 {
 	u64 steal;
@@ -414,6 +425,9 @@  void __init kvm_guest_init(void)
 
 	paravirt_ops_setup();
 	register_reboot_notifier(&kvm_pv_reboot_nb);
+	if (kvm_pv_has_feature(KVM_PV_FEATURE_PANICKED))
+		atomic_notifier_chain_register(&panic_notifier_list,
+			&kvm_pv_panic_nb);
 	for (i = 0; i < KVM_TASK_SLEEP_HASHSIZE; i++)
 		spin_lock_init(&async_pf_sleepers[i].lock);
 	if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF))
diff --git a/include/linux/kvm_para.h b/include/linux/kvm_para.h
index ff476dd..e73efcf 100644
--- a/include/linux/kvm_para.h
+++ b/include/linux/kvm_para.h
@@ -20,6 +20,12 @@ 
 #define KVM_HC_FEATURES			3
 #define KVM_HC_PPC_MAP_MAGIC_PAGE	4
 
+/* The bit of the value read from KVM_PV_PORT */
+#define KVM_PV_FEATURE_PANICKED	0
+
+/* The value writen to KVM_PV_PORT */
+#define KVM_PV_PANICKED		1
+
 /*
  * hypercalls use architecture specific
  */
@@ -33,5 +39,12 @@  static inline int kvm_para_has_feature(unsigned int feature)
 		return 1;
 	return 0;
 }
+
+static inline int kvm_pv_has_feature(unsigned int feature)
+{
+	if (kvm_arch_pv_features() & (1UL << feature))
+		return 1;
+	return 0;
+}
 #endif /* __KERNEL__ */
 #endif /* __LINUX_KVM_PARA_H */