Message ID | 4E97D85C.7070107@cn.fujitsu.com |
---|---|
State | New |
Headers | show |
On 2011-10-14 08:36, Lai Jiangshan wrote: > On 10/14/2011 01:53 PM, Jan Kiszka wrote: >> On 2011-10-14 02:53, Lai Jiangshan wrote: >>> >>>> >>>> As explained in some other mail, we could then emulate the missing >>>> kernel feature by reading out the current in-kernel APIC state, testing >>>> if LINT1 is unmasked, and then delivering the NMI directly. >>>> >>> >>> Only the thread of the VCPU can safely get the in-kernel LAPIC states, >>> so this approach will cause some troubles. >> >> run_on_cpu() can help. >> >> Jan >> > > Ah, I forgot it, Thanks. > > From: Lai Jiangshan <laijs@cn.fujitsu.com> > > Currently, NMI interrupt is blindly sent to all the vCPUs when NMI > button event happens. This doesn't properly emulate real hardware on > which NMI button event triggers LINT1. Because of this, NMI is sent to > the processor even when LINT1 is maskied in LVT. For example, this > causes the problem that kdump initiated by NMI sometimes doesn't work > on KVM, because kdump assumes NMI is masked on CPUs other than CPU0. > > With this patch, inject-nmi request is handled as follows. > > - When in-kernel irqchip is disabled, deliver LINT1 instead of NMI > interrupt. > - When in-kernel irqchip is enabled, get the in-kernel LAPIC states > and test the APIC_LVT_MASKED, if LINT1 is unmasked, and then > delivering the NMI directly. (Suggested by Jan Kiszka) > > Changed from old version: > re-implement it by the Jan's suggestion. > > Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> > Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> > --- > hw/apic.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ > hw/apic.h | 1 + > monitor.c | 6 +++++- > 3 files changed, 54 insertions(+), 1 deletions(-) > diff --git a/hw/apic.c b/hw/apic.c > index 69d6ac5..9a40129 100644 > --- a/hw/apic.c > +++ b/hw/apic.c > @@ -205,6 +205,54 @@ void apic_deliver_pic_intr(DeviceState *d, int level) > } > } > > +#ifdef KVM_CAP_IRQCHIP Again, this is always defined on x86 thus pointless to test. > +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id); > + > +struct kvm_get_remote_lapic_params { > + CPUState *env; > + struct kvm_lapic_state klapic; > +}; > + > +static void kvm_get_remote_lapic(void *p) > +{ > + struct kvm_get_remote_lapic_params *params = p; > + > + kvm_get_lapic(params->env, ¶ms->klapic); When you already interrupted that vcpu, why not inject from here? Avoids one further ping-pong round. > +} > + > +void apic_deliver_nmi(DeviceState *d) > +{ > + APICState *s = DO_UPCAST(APICState, busdev.qdev, d); > + > + if (kvm_irqchip_in_kernel()) { > + struct kvm_get_remote_lapic_params p = {.env = s->cpu_env,}; > + uint32_t lvt; > + > + run_on_cpu(s->cpu_env, kvm_get_remote_lapic, &p); > + lvt = kapic_reg(&p.klapic, 0x32 + APIC_LVT_LINT1); > + > + if (lvt & APIC_LVT_MASKED) { > + return; > + } > + > + if (((lvt >> 8) & 7) != APIC_DM_NMI) { > + return; > + } > + > + cpu_interrupt(s->cpu_env, CPU_INTERRUPT_NMI); Err, aren't you introducing KVM_CAP_LAPIC_NMI that allows to test if this workaround is needed? Oh, your latest kernel patch is missing this again - requires fixing as well. Jan
On 10/14/2011 02:49 PM, Jan Kiszka wrote: > On 2011-10-14 08:36, Lai Jiangshan wrote: >> On 10/14/2011 01:53 PM, Jan Kiszka wrote: >>> On 2011-10-14 02:53, Lai Jiangshan wrote: >>>> >>>>> >>>>> As explained in some other mail, we could then emulate the missing >>>>> kernel feature by reading out the current in-kernel APIC state, testing >>>>> if LINT1 is unmasked, and then delivering the NMI directly. >>>>> >>>> >>>> Only the thread of the VCPU can safely get the in-kernel LAPIC states, >>>> so this approach will cause some troubles. >>> >>> run_on_cpu() can help. >>> >>> Jan >>> >> >> Ah, I forgot it, Thanks. >> >> From: Lai Jiangshan <laijs@cn.fujitsu.com> >> >> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI >> button event happens. This doesn't properly emulate real hardware on >> which NMI button event triggers LINT1. Because of this, NMI is sent to >> the processor even when LINT1 is maskied in LVT. For example, this >> causes the problem that kdump initiated by NMI sometimes doesn't work >> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0. >> >> With this patch, inject-nmi request is handled as follows. >> >> - When in-kernel irqchip is disabled, deliver LINT1 instead of NMI >> interrupt. >> - When in-kernel irqchip is enabled, get the in-kernel LAPIC states >> and test the APIC_LVT_MASKED, if LINT1 is unmasked, and then >> delivering the NMI directly. (Suggested by Jan Kiszka) >> >> Changed from old version: >> re-implement it by the Jan's suggestion. >> >> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> >> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> >> --- >> hw/apic.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ >> hw/apic.h | 1 + >> monitor.c | 6 +++++- >> 3 files changed, 54 insertions(+), 1 deletions(-) >> diff --git a/hw/apic.c b/hw/apic.c >> index 69d6ac5..9a40129 100644 >> --- a/hw/apic.c >> +++ b/hw/apic.c >> @@ -205,6 +205,54 @@ void apic_deliver_pic_intr(DeviceState *d, int level) >> } >> } >> >> +#ifdef KVM_CAP_IRQCHIP > > Again, this is always defined on x86 thus pointless to test. > >> +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id); >> + >> +struct kvm_get_remote_lapic_params { >> + CPUState *env; >> + struct kvm_lapic_state klapic; >> +}; >> + >> +static void kvm_get_remote_lapic(void *p) >> +{ >> + struct kvm_get_remote_lapic_params *params = p; >> + >> + kvm_get_lapic(params->env, ¶ms->klapic); > > When you already interrupted that vcpu, why not inject from here? Avoids > one further ping-pong round. get_remote_lapic and inject nmi are two different things, so I don't inject nmi from here. I didn't notice this ping-pond overhead. Thank you. > >> +} >> + >> +void apic_deliver_nmi(DeviceState *d) >> +{ >> + APICState *s = DO_UPCAST(APICState, busdev.qdev, d); >> + >> + if (kvm_irqchip_in_kernel()) { >> + struct kvm_get_remote_lapic_params p = {.env = s->cpu_env,}; >> + uint32_t lvt; >> + >> + run_on_cpu(s->cpu_env, kvm_get_remote_lapic, &p); >> + lvt = kapic_reg(&p.klapic, 0x32 + APIC_LVT_LINT1); >> + >> + if (lvt & APIC_LVT_MASKED) { >> + return; >> + } >> + >> + if (((lvt >> 8) & 7) != APIC_DM_NMI) { >> + return; >> + } >> + >> + cpu_interrupt(s->cpu_env, CPU_INTERRUPT_NMI); > > Err, aren't you introducing KVM_CAP_LAPIC_NMI that allows to test if > this workaround is needed? Oh, your latest kernel patch is missing this > again - requires fixing as well. > Kernel site patch is dropped with this v4 patch. Did you mean you want KVM_CAP_SET_LINT1 + KVM_SET_LINT1 patches? I have made them. Sent soon. Thanks, Lai
On 2011-10-14 09:43, Lai Jiangshan wrote: > On 10/14/2011 02:49 PM, Jan Kiszka wrote: >> On 2011-10-14 08:36, Lai Jiangshan wrote: >>> On 10/14/2011 01:53 PM, Jan Kiszka wrote: >>>> On 2011-10-14 02:53, Lai Jiangshan wrote: >>>>> >>>>>> >>>>>> As explained in some other mail, we could then emulate the missing >>>>>> kernel feature by reading out the current in-kernel APIC state, testing >>>>>> if LINT1 is unmasked, and then delivering the NMI directly. >>>>>> >>>>> >>>>> Only the thread of the VCPU can safely get the in-kernel LAPIC states, >>>>> so this approach will cause some troubles. >>>> >>>> run_on_cpu() can help. >>>> >>>> Jan >>>> >>> >>> Ah, I forgot it, Thanks. >>> >>> From: Lai Jiangshan <laijs@cn.fujitsu.com> >>> >>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI >>> button event happens. This doesn't properly emulate real hardware on >>> which NMI button event triggers LINT1. Because of this, NMI is sent to >>> the processor even when LINT1 is maskied in LVT. For example, this >>> causes the problem that kdump initiated by NMI sometimes doesn't work >>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0. >>> >>> With this patch, inject-nmi request is handled as follows. >>> >>> - When in-kernel irqchip is disabled, deliver LINT1 instead of NMI >>> interrupt. >>> - When in-kernel irqchip is enabled, get the in-kernel LAPIC states >>> and test the APIC_LVT_MASKED, if LINT1 is unmasked, and then >>> delivering the NMI directly. (Suggested by Jan Kiszka) >>> >>> Changed from old version: >>> re-implement it by the Jan's suggestion. >>> >>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> >>> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> >>> --- >>> hw/apic.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ >>> hw/apic.h | 1 + >>> monitor.c | 6 +++++- >>> 3 files changed, 54 insertions(+), 1 deletions(-) >>> diff --git a/hw/apic.c b/hw/apic.c >>> index 69d6ac5..9a40129 100644 >>> --- a/hw/apic.c >>> +++ b/hw/apic.c >>> @@ -205,6 +205,54 @@ void apic_deliver_pic_intr(DeviceState *d, int level) >>> } >>> } >>> >>> +#ifdef KVM_CAP_IRQCHIP >> >> Again, this is always defined on x86 thus pointless to test. >> >>> +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id); >>> + >>> +struct kvm_get_remote_lapic_params { >>> + CPUState *env; >>> + struct kvm_lapic_state klapic; >>> +}; >>> + >>> +static void kvm_get_remote_lapic(void *p) >>> +{ >>> + struct kvm_get_remote_lapic_params *params = p; >>> + >>> + kvm_get_lapic(params->env, ¶ms->klapic); >> >> When you already interrupted that vcpu, why not inject from here? Avoids >> one further ping-pong round. > > get_remote_lapic and inject nmi are two different things, > so I don't inject nmi from here. I didn't notice this ping-pond overhead. > Thank you. Actually, it is not performance-critical. But there is a race between obtaining the APIC state and testing for the NMI injection path. So it's better to define an on-vcpu LINT1 NMI injection service. > >> >>> +} >>> + >>> +void apic_deliver_nmi(DeviceState *d) >>> +{ >>> + APICState *s = DO_UPCAST(APICState, busdev.qdev, d); >>> + >>> + if (kvm_irqchip_in_kernel()) { >>> + struct kvm_get_remote_lapic_params p = {.env = s->cpu_env,}; >>> + uint32_t lvt; >>> + >>> + run_on_cpu(s->cpu_env, kvm_get_remote_lapic, &p); >>> + lvt = kapic_reg(&p.klapic, 0x32 + APIC_LVT_LINT1); >>> + >>> + if (lvt & APIC_LVT_MASKED) { >>> + return; >>> + } >>> + >>> + if (((lvt >> 8) & 7) != APIC_DM_NMI) { >>> + return; >>> + } >>> + >>> + cpu_interrupt(s->cpu_env, CPU_INTERRUPT_NMI); >> >> Err, aren't you introducing KVM_CAP_LAPIC_NMI that allows to test if >> this workaround is needed? Oh, your latest kernel patch is missing this >> again - requires fixing as well. >> > > > Kernel site patch is dropped with this v4 patch. > > Did you mean you want KVM_CAP_SET_LINT1 + KVM_SET_LINT1 patches? > I have made them. OK, so this is going to be applied on top? Then I take this remark back. Jan
diff --git a/hw/apic.c b/hw/apic.c index 69d6ac5..9a40129 100644 --- a/hw/apic.c +++ b/hw/apic.c @@ -205,6 +205,54 @@ void apic_deliver_pic_intr(DeviceState *d, int level) } } +#ifdef KVM_CAP_IRQCHIP +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id); + +struct kvm_get_remote_lapic_params { + CPUState *env; + struct kvm_lapic_state klapic; +}; + +static void kvm_get_remote_lapic(void *p) +{ + struct kvm_get_remote_lapic_params *params = p; + + kvm_get_lapic(params->env, ¶ms->klapic); +} + +void apic_deliver_nmi(DeviceState *d) +{ + APICState *s = DO_UPCAST(APICState, busdev.qdev, d); + + if (kvm_irqchip_in_kernel()) { + struct kvm_get_remote_lapic_params p = {.env = s->cpu_env,}; + uint32_t lvt; + + run_on_cpu(s->cpu_env, kvm_get_remote_lapic, &p); + lvt = kapic_reg(&p.klapic, 0x32 + APIC_LVT_LINT1); + + if (lvt & APIC_LVT_MASKED) { + return; + } + + if (((lvt >> 8) & 7) != APIC_DM_NMI) { + return; + } + + cpu_interrupt(s->cpu_env, CPU_INTERRUPT_NMI); + } else { + apic_local_deliver(s, APIC_LVT_LINT1); + } +} +#else +void apic_deliver_nmi(DeviceState *d) +{ + APICState *s = DO_UPCAST(APICState, busdev.qdev, d); + + apic_local_deliver(s, APIC_LVT_LINT1); +} +#endif + #define foreach_apic(apic, deliver_bitmask, code) \ {\ int __i, __j, __mask;\ diff --git a/hw/apic.h b/hw/apic.h index c857d52..3a4be0a 100644 --- a/hw/apic.h +++ b/hw/apic.h @@ -10,6 +10,7 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode, uint8_t trigger_mode); int apic_accept_pic_intr(DeviceState *s); void apic_deliver_pic_intr(DeviceState *s, int level); +void apic_deliver_nmi(DeviceState *d); int apic_get_interrupt(DeviceState *s); void apic_reset_irq_delivered(void); int apic_get_irq_delivered(void); diff --git a/monitor.c b/monitor.c index cb485bf..0b81f17 100644 --- a/monitor.c +++ b/monitor.c @@ -2616,7 +2616,11 @@ static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject **ret_data) CPUState *env; for (env = first_cpu; env != NULL; env = env->next_cpu) { - cpu_interrupt(env, CPU_INTERRUPT_NMI); + if (!env->apic_state) { + cpu_interrupt(env, CPU_INTERRUPT_NMI); + } else { + apic_deliver_nmi(env->apic_state); + } } return 0;