Message ID | 4EA68790.4020606@cn.fujitsu.com |
---|---|
State | New |
On Tue, Oct 25, 2011 at 05:55:28PM +0800, Lai Jiangshan wrote:
> Previous discussions------------:
> >> >> >
> >> >> Which approach do you prefer?
> >> >> I need to know the result before wasting too much time to respin
> >> >> the approach.
> > >
> > > Yes, sorry about the slow and sometimes conflicting feedback.
> > >
> >> >> 1) Fix KVM_NMI emulation approach (which is v3 patchset)
> >> >>    - It directly fixes the problem and matches the
> >> >>      real hardware more, but it changes KVM_NMI behavior.
> >> >>    - Requires both kernel-side and userspace-side fixes.
> >> >>
> >> >> 2) Get the LAPIC state from kernel irqchip, and inject NMI if it is allowed
> >> >>    (which is v4 patchset)
> >> >>    - Simple, doesn't change any kernel behavior.
> >> >>    - Only needs the userspace-side fix.
> >> >>
> >> >> 3) Add KVM_SET_LINT1 approach (which is v5 patchset)
> >> >>    - Doesn't change the kernel's KVM_NMI behavior.
> >> >>    - Much more complex.
> >> >>    - Requires both kernel-side and userspace-side fixes.
> >> >>    - The userspace side should also handle the !KVM_SET_LINT1
> >> >>      condition; it uses all of approach 2)'s code. It means
> >> >>      this approach equals the 2) approach + KVM_SET_LINT1 ioctl.
> >> >>
> >> >> This is an urgent bug for us, we need to settle it soon.
> > >
> > > While (1) is simple, it overloads a single ioctl with two meanings,
> > > that's not so good.
> > >
> > > Whether we do (1) or (3), we need (2) as well, for older kernels.
> > >
> > > So I recommend first focusing on (2) and merging it, then doing (3).
> > >
> > > (note an additional issue with 3 is whether to make it a vm or vcpu
> > > ioctl - we've been assuming vcpu ioctl but it's not necessarily the best
> > > choice).
> > >
> It is the 2) approach.
> It only changes the userspace side; the kernel side is not touched.
> It is changed from the previous v4 patch and fixes the problems found by Jan.
> ----------------------------end previous discussions
>
>
> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>
>
> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
> button event happens. This doesn't properly emulate real hardware on
> which NMI button event triggers LINT1. Because of this, NMI is sent to
> the processor even when LINT1 is masked in LVT. For example, this
> causes the problem that kdump initiated by NMI sometimes doesn't work
> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>
> With this patch, inject-nmi request is handled as follows.
>
> - When in-kernel irqchip is disabled, deliver LINT1 instead of NMI
>   interrupt.
> - When in-kernel irqchip is enabled, get the in-kernel LAPIC state
>   and test APIC_LVT_MASKED; if LINT1 is unmasked, deliver the NMI
>   directly. (Suggested by Jan Kiszka)
>
> Changed from old version:
>   reimplemented following Jan's suggestion.
>   fixed the race found by Jan.
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Acked-by: Avi Kivity <avi@redhat.com>
> Acked-by: Jan Kiszka <jan.kiszka@web.de>

Please rebase.
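As a reading aid, the check described in the commit message can be sketched as a standalone predicate. This is only a sketch under stated assumptions, not the patch's code: `lint1_delivers_nmi` is a hypothetical helper, and the constants are assumed to follow the local APIC LVT entry layout (mask in bit 16, delivery mode in bits 8-10, NMI encoded as 4).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, following the local APIC LVT entry layout. */
#define LVT_MASKED    (1u << 16)  /* bit 16: entry is masked */
#define LVT_DM_SHIFT  8           /* bits 8-10: delivery mode */
#define LVT_DM_MASK   0x7u
#define DM_NMI        4u          /* delivery mode 100b = NMI */

/*
 * Hypothetical helper: returns true if an NMI-button event arriving on
 * LINT1 should reach this vCPU as an NMI, given the raw LVT LINT1 value.
 */
static bool lint1_delivers_nmi(uint32_t lvt)
{
    if (lvt & LVT_MASKED) {
        return false;             /* masked: the event is dropped */
    }
    /* Unmasked: only inject when LINT1 is programmed for NMI delivery. */
    return ((lvt >> LVT_DM_SHIFT) & LVT_DM_MASK) == DM_NMI;
}
```

With a predicate of this shape, a vCPU whose LINT1 entry is masked, which is the state the kdump scenario expects on CPUs other than CPU0, would not receive the NMI.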
diff --git a/hw/apic.c b/hw/apic.c
index 69d6ac5..922796a 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -205,6 +205,39 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
     }
 }
 
+static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
+
+static void kvm_irqchip_deliver_nmi(void *p)
+{
+    APICState *s = p;
+    struct kvm_lapic_state klapic;
+    uint32_t lvt;
+
+    kvm_get_lapic(s->cpu_env, &klapic);
+    lvt = kapic_reg(&klapic, 0x32 + APIC_LVT_LINT1);
+
+    if (lvt & APIC_LVT_MASKED) {
+        return;
+    }
+
+    if (((lvt >> 8) & 7) != APIC_DM_NMI) {
+        return;
+    }
+
+    kvm_vcpu_ioctl(s->cpu_env, KVM_NMI);
+}
+
+void apic_deliver_nmi(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    if (kvm_irqchip_in_kernel()) {
+        run_on_cpu(s->cpu_env, kvm_irqchip_deliver_nmi, s);
+    } else {
+        apic_local_deliver(s, APIC_LVT_LINT1);
+    }
+}
+
 #define foreach_apic(apic, deliver_bitmask, code) \
 {\
     int __i, __j, __mask;\
diff --git a/hw/apic.h b/hw/apic.h
index c857d52..3a4be0a 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -10,6 +10,7 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
                       uint8_t trigger_mode);
 int apic_accept_pic_intr(DeviceState *s);
 void apic_deliver_pic_intr(DeviceState *s, int level);
+void apic_deliver_nmi(DeviceState *d);
 int apic_get_interrupt(DeviceState *s);
 void apic_reset_irq_delivered(void);
 int apic_get_irq_delivered(void);
diff --git a/monitor.c b/monitor.c
index cb485bf..0b81f17 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2616,7 +2616,11 @@ static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject **ret_data)
     CPUState *env;
 
     for (env = first_cpu; env != NULL; env = env->next_cpu) {
-        cpu_interrupt(env, CPU_INTERRUPT_NMI);
+        if (!env->apic_state) {
+            cpu_interrupt(env, CPU_INTERRUPT_NMI);
+        } else {
+            apic_deliver_nmi(env->apic_state);
+        }
     }
 
     return 0;
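The patch only forward-declares `kapic_reg()`, so the meaning of the `0x32 + APIC_LVT_LINT1` index is not visible in the diff. The sketch below shows one plausible reading, assuming the saved LAPIC state is a raw image of the register page in which registers sit 16 bytes apart. `struct lapic_image`, `lapic_image_reg`, `lapic_image_set_reg`, the 0x400 size, and `LVT_LINT1 = 4` are all assumptions introduced here for illustration, not names or values taken from the patch.

```c
#include <stdint.h>
#include <string.h>

#define LAPIC_REG_SPACE 0x400  /* assumed size of the register page image */
#define LVT_BASE_INDEX  0x32   /* index of the first LVT entry, as in the patch */
#define LVT_LINT1       4      /* assumed position of LINT1 within the LVT */

/* Hypothetical stand-in for the saved in-kernel LAPIC state. */
struct lapic_image {
    char regs[LAPIC_REG_SPACE];
};

/* Registers are 16 bytes apart, so index N lives at byte offset N << 4. */
static uint32_t lapic_image_reg(const struct lapic_image *img, int reg_id)
{
    uint32_t val;

    /* memcpy avoids alignment and aliasing problems on the char buffer. */
    memcpy(&val, img->regs + ((size_t)reg_id << 4), sizeof(val));
    return val;
}

/* Counterpart used only to build an image for experiments. */
static void lapic_image_set_reg(struct lapic_image *img, int reg_id,
                                uint32_t val)
{
    memcpy(img->regs + ((size_t)reg_id << 4), &val, sizeof(val));
}
```

Under these assumptions, index `0x32 + 4 = 0x36` maps to byte offset `0x360`, which is where the LVT LINT1 register sits in the local APIC register page.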