Message ID | 1428330967-8394-1-git-send-email-chris.j.arges@canonical.com
---|---
State | New
On 04/06/2015 08:36 AM, Chris J Arges wrote:
> From: Bandan Das <bsd@redhat.com>
>
> BugLink: http://bugs.launchpad.net/bugs/1413540
>
> With commit b6b8a1451fc40412c57d1, which introduced
> vmx_check_nested_events, checks for injectable interrupts happen
> at different points in time for L1 and L2, which could potentially
> cause a race. The regression occurs because KVM_REQ_EVENT is always
> set when nested_run_pending is set, even if there's no pending
> interrupt. Consequently, there could be a small window in which
> check_nested_events returns without exiting to L1, but an interrupt
> comes through soon after and is incorrectly injected into L2 by
> inject_pending_event. Fix this by also checking for nested events
> when a check for an injectable interrupt returns true.
>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> (cherry picked from commit 9242b5b60df8b13b469bc6b7be08ff6ebb551ad3)
> Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
> ---
>  arch/x86/kvm/x86.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 677c66a..2515bc8 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5834,6 +5834,18 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
>  			kvm_x86_ops->set_nmi(vcpu);
>  		}
>  	} else if (kvm_cpu_has_injectable_intr(vcpu)) {
> +		/*
> +		 * Because interrupts can be injected asynchronously, we are
> +		 * calling check_nested_events again here to avoid a race condition.
> +		 * See https://lkml.org/lkml/2014/7/2/60 for discussion about this
> +		 * proposal and current concerns. Perhaps we should be setting
> +		 * KVM_REQ_EVENT only on certain events and not unconditionally?
> +		 */
> +		if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events) {
> +			r = kvm_x86_ops->check_nested_events(vcpu, req_int_win);
> +			if (r != 0)
> +				return r;
> +		}
>  		if (kvm_x86_ops->interrupt_allowed(vcpu)) {
>  			kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
>  					    false);

Unless I'm missing something in the bug report, there is no information
there that indicates this patch really does mitigate the soft lockup.
It looks like there are a couple of reporters who would be willing to
run a test kernel and provide some feedback.

rtg
On 04/06/2015 10:17 AM, Tim Gardner wrote:
> On 04/06/2015 08:36 AM, Chris J Arges wrote:
>> From: Bandan Das <bsd@redhat.com>
>>
>> BugLink: http://bugs.launchpad.net/bugs/1413540
>>
>> With commit b6b8a1451fc40412c57d1, which introduced
>> vmx_check_nested_events, checks for injectable interrupts happen
>> at different points in time for L1 and L2, which could potentially
>> cause a race. The regression occurs because KVM_REQ_EVENT is always
>> set when nested_run_pending is set, even if there's no pending
>> interrupt. Consequently, there could be a small window in which
>> check_nested_events returns without exiting to L1, but an interrupt
>> comes through soon after and is incorrectly injected into L2 by
>> inject_pending_event. Fix this by also checking for nested events
>> when a check for an injectable interrupt returns true.
>>
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> (cherry picked from commit 9242b5b60df8b13b469bc6b7be08ff6ebb551ad3)
>> Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
>> ---
>>  arch/x86/kvm/x86.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 677c66a..2515bc8 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5834,6 +5834,18 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
>>  			kvm_x86_ops->set_nmi(vcpu);
>>  		}
>>  	} else if (kvm_cpu_has_injectable_intr(vcpu)) {
>> +		/*
>> +		 * Because interrupts can be injected asynchronously, we are
>> +		 * calling check_nested_events again here to avoid a race condition.
>> +		 * See https://lkml.org/lkml/2014/7/2/60 for discussion about this
>> +		 * proposal and current concerns. Perhaps we should be setting
>> +		 * KVM_REQ_EVENT only on certain events and not unconditionally?
>> +		 */
>> +		if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events) {
>> +			r = kvm_x86_ops->check_nested_events(vcpu, req_int_win);
>> +			if (r != 0)
>> +				return r;
>> +		}
>>  		if (kvm_x86_ops->interrupt_allowed(vcpu)) {
>>  			kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
>>  					    false);
>
> Unless I'm missing something in the bug report, there is no information
> there that indicates this patch really does mitigate the soft lockup.
> It looks like there are a couple of reporters who would be willing to
> run a test kernel and provide some feedback.
>
> rtg

Tim,
I can reproduce this issue at will, and have done the testing to show
that this patch reduces the occurrence of softlockups. In addition, the
patch itself references a patch we already have in the 3.13 distro
kernel (b6b8a1451fc40412c57d1).
--chris
Based on the info that the patch this new one references has already
been added, and that this was tested to at least reduce the lockups.
Another sad incident of a feature which maybe should not have been
declared working after all... :/

-Stefan
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 677c66a..2515bc8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5834,6 +5834,18 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
 			kvm_x86_ops->set_nmi(vcpu);
 		}
 	} else if (kvm_cpu_has_injectable_intr(vcpu)) {
+		/*
+		 * Because interrupts can be injected asynchronously, we are
+		 * calling check_nested_events again here to avoid a race condition.
+		 * See https://lkml.org/lkml/2014/7/2/60 for discussion about this
+		 * proposal and current concerns. Perhaps we should be setting
+		 * KVM_REQ_EVENT only on certain events and not unconditionally?
+		 */
+		if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events) {
+			r = kvm_x86_ops->check_nested_events(vcpu, req_int_win);
+			if (r != 0)
+				return r;
+		}
 		if (kvm_x86_ops->interrupt_allowed(vcpu)) {
 			kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
 					    false);
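[Editor's note] To make the ordering change easier to follow outside the
kernel tree, below is a minimal userspace sketch of the patched control
flow. The struct vcpu fields and the simplified check_nested_events()
here are invented stand-ins for illustration only; the real KVM code
works through kvm_x86_ops callbacks and far richer vCPU state. The point
the patch makes is simply that the nested-events check must be repeated
after kvm_cpu_has_injectable_intr() reports an interrupt, so an interrupt
arriving in the window between the two checks exits to L1 instead of
being injected into L2.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the KVM vCPU state. */
struct vcpu {
	bool guest_mode;         /* vCPU is running a nested (L2) guest */
	bool injectable_intr;    /* an interrupt is waiting to be injected */
	bool l1_wants_intr_exit; /* L1 intercepts external interrupts */
};

/* Models check_nested_events(): nonzero means "exit to L1 instead of
 * injecting into L2". */
static int check_nested_events(struct vcpu *v)
{
	return (v->guest_mode && v->l1_wants_intr_exit) ? 1 : 0;
}

static int inject_pending_event(struct vcpu *v)
{
	if (v->injectable_intr) {
		/*
		 * The fix: re-check nested events *after* learning an
		 * interrupt is injectable, so one that arrived since the
		 * earlier check is routed to L1 rather than injected
		 * into L2.
		 */
		if (v->guest_mode) {
			int r = check_nested_events(v);
			if (r != 0)
				return r;
		}
		printf("interrupt injected into current guest\n");
	}
	return 0;
}

int main(void)
{
	struct vcpu v = {
		.guest_mode = true,
		.injectable_intr = true,
		.l1_wants_intr_exit = true,
	};
	/* Prints "exit to L1: 1"; without the re-check the interrupt
	 * would have been injected into L2. */
	printf("exit to L1: %d\n", inject_pending_event(&v));
	return 0;
}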