Message ID | 4D418230.1010801@siemens.com |
---|---|
State | New |
Headers | show |
On 01/27/2011 04:33 PM, Jan Kiszka wrote: > Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between > checking for exit_request on vcpu entry and timer signals arriving > before KVM starts to catch them. Plug it by blocking both timer related > signals also on !CONFIG_IOTHREAD and process those via signalfd. > > As this fix depends on real signalfd support (otherwise the timer > signals only kick the compat helper thread, and the main thread hangs), > we need to detect the invalid constellation and abort configure. > > Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com> > CC: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> > --- > > I don't want to invest that much into !IOTHREAD anymore, so let's see if > the proposed catch&abort is acceptable. > I don't understand the dependency on signalfd. The normal way of doing things, either waiting for the signal in sigtimedwait() or in ioctl(KVM_RUN), works with SIGALRM just fine.
On 2011-01-31 11:03, Avi Kivity wrote: > On 01/27/2011 04:33 PM, Jan Kiszka wrote: >> Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between >> checking for exit_request on vcpu entry and timer signals arriving >> before KVM starts to catch them. Plug it by blocking both timer related >> signals also on !CONFIG_IOTHREAD and process those via signalfd. >> >> As this fix depends on real signalfd support (otherwise the timer >> signals only kick the compat helper thread, and the main thread hangs), >> we need to detect the invalid constellation and abort configure. >> >> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com> >> CC: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> >> --- >> >> I don't want to invest that much into !IOTHREAD anymore, so let's see if >> the proposed catch&abort is acceptable. >> > > I don't understand the dependency on signalfd. The normal way of doing > things, either waiting for the signal in sigtimedwait() or in > ioctl(KVM_RUN), works with SIGALRM just fine. And how would you be kicked out of the select() call if it is waiting with a timeout? We only have a single thread here. The only alternative is Stefan's original proposal. But that required fiddling with the signal mask twice per KVM_RUN. Jan
On Mon, Jan 31, 2011 at 11:27 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote: > On 2011-01-31 11:03, Avi Kivity wrote: >> On 01/27/2011 04:33 PM, Jan Kiszka wrote: >>> Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between >>> checking for exit_request on vcpu entry and timer signals arriving >>> before KVM starts to catch them. Plug it by blocking both timer related >>> signals also on !CONFIG_IOTHREAD and process those via signalfd. >>> >>> As this fix depends on real signalfd support (otherwise the timer >>> signals only kick the compat helper thread, and the main thread hangs), >>> we need to detect the invalid constellation and abort configure. >>> >>> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com> >>> CC: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> >>> --- >>> >>> I don't want to invest that much into !IOTHREAD anymore, so let's see if >>> the proposed catch&abort is acceptable. >>> >> >> I don't understand the dependency on signalfd. The normal way of doing >> things, either waiting for the signal in sigtimedwait() or in >> ioctl(KVM_RUN), works with SIGALRM just fine. > > And how would you be kicked out of the select() call if it is waiting > with a timeout? We only have a single thread here. > > The only alternative is Stefan's original proposal. But that required > fiddling with the signal mask twice per KVM_RUN. I think my original patch messed with the sigmask in the wrong place, as you mentioned doing it twice per KVM_RUN isn't a good idea. I wonder if we can enable SIGALRM only in blocking calls and guest code execution but without signalfd. It might be possible, I don't see an immediate problem with doing that, we might have to use pselect(2) or similar in a few places. Stefan
On 2011-01-31 13:13, Stefan Hajnoczi wrote: > On Mon, Jan 31, 2011 at 11:27 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote: >> On 2011-01-31 11:03, Avi Kivity wrote: >>> On 01/27/2011 04:33 PM, Jan Kiszka wrote: >>>> Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between >>>> checking for exit_request on vcpu entry and timer signals arriving >>>> before KVM starts to catch them. Plug it by blocking both timer related >>>> signals also on !CONFIG_IOTHREAD and process those via signalfd. >>>> >>>> As this fix depends on real signalfd support (otherwise the timer >>>> signals only kick the compat helper thread, and the main thread hangs), >>>> we need to detect the invalid constellation and abort configure. >>>> >>>> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com> >>>> CC: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> >>>> --- >>>> >>>> I don't want to invest that much into !IOTHREAD anymore, so let's see if >>>> the proposed catch&abort is acceptable. >>>> >>> >>> I don't understand the dependency on signalfd. The normal way of doing >>> things, either waiting for the signal in sigtimedwait() or in >>> ioctl(KVM_RUN), works with SIGALRM just fine. >> >> And how would you be kicked out of the select() call if it is waiting >> with a timeout? We only have a single thread here. >> >> The only alternative is Stefan's original proposal. But that required >> fiddling with the signal mask twice per KVM_RUN. > > I think my original patch messed with the sigmask in the wrong place, > as you mentioned doing it twice per KVM_RUN isn't a good idea. I > wonder if we can enable SIGALRM only in blocking calls and guest code > execution but without signalfd. It might be possible, I don't see an > immediate problem with doing that, we might have to use pselect(2) or > similar in a few places. My main concern about alternative approaches is that IOTHREAD is about to become the default, and hardly anyone (of the few upstream KVM users) will run without it in the foreseeable future. The next step will be the removal of any !CONFIG_IOTHREAD section. So, how much do we want to invest here (provided my proposal has not remaining issues)? Jan
On 01/31/2011 01:27 PM, Jan Kiszka wrote: > On 2011-01-31 11:03, Avi Kivity wrote: > > On 01/27/2011 04:33 PM, Jan Kiszka wrote: > >> Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between > >> checking for exit_request on vcpu entry and timer signals arriving > >> before KVM starts to catch them. Plug it by blocking both timer related > >> signals also on !CONFIG_IOTHREAD and process those via signalfd. > >> > >> As this fix depends on real signalfd support (otherwise the timer > >> signals only kick the compat helper thread, and the main thread hangs), > >> we need to detect the invalid constellation and abort configure. > >> > >> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com> > >> CC: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> > >> --- > >> > >> I don't want to invest that much into !IOTHREAD anymore, so let's see if > >> the proposed catch&abort is acceptable. > >> > > > > I don't understand the dependency on signalfd. The normal way of doing > > things, either waiting for the signal in sigtimedwait() or in > > ioctl(KVM_RUN), works with SIGALRM just fine. > > And how would you be kicked out of the select() call if it is waiting > with a timeout? We only have a single thread here. If we use signalfd() (either kernel provided or thread+pipe), we kick out of select by select()ing it (though I don't see how it works without an iothread, since an fd can't stop a vcpu unless you enable SIGIO on it, which is silly for signalfd) If you leave it as a naked signal, then it can break out of either pselect() or vcpu. Since the goal is to drop !CONFIG_IOTHREAD, the first path seems better, I just don't understand the problem with emulated signalfd().
On Mon, Jan 31, 2011 at 12:18 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote: > On 2011-01-31 13:13, Stefan Hajnoczi wrote: >> On Mon, Jan 31, 2011 at 11:27 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote: >>> On 2011-01-31 11:03, Avi Kivity wrote: >>>> On 01/27/2011 04:33 PM, Jan Kiszka wrote: >>>>> Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between >>>>> checking for exit_request on vcpu entry and timer signals arriving >>>>> before KVM starts to catch them. Plug it by blocking both timer related >>>>> signals also on !CONFIG_IOTHREAD and process those via signalfd. >>>>> >>>>> As this fix depends on real signalfd support (otherwise the timer >>>>> signals only kick the compat helper thread, and the main thread hangs), >>>>> we need to detect the invalid constellation and abort configure. >>>>> >>>>> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com> >>>>> CC: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> >>>>> --- >>>>> >>>>> I don't want to invest that much into !IOTHREAD anymore, so let's see if >>>>> the proposed catch&abort is acceptable. >>>>> >>>> >>>> I don't understand the dependency on signalfd. The normal way of doing >>>> things, either waiting for the signal in sigtimedwait() or in >>>> ioctl(KVM_RUN), works with SIGALRM just fine. >>> >>> And how would you be kicked out of the select() call if it is waiting >>> with a timeout? We only have a single thread here. >>> >>> The only alternative is Stefan's original proposal. But that required >>> fiddling with the signal mask twice per KVM_RUN. >> >> I think my original patch messed with the sigmask in the wrong place, >> as you mentioned doing it twice per KVM_RUN isn't a good idea. I >> wonder if we can enable SIGALRM only in blocking calls and guest code >> execution but without signalfd. It might be possible, I don't see an >> immediate problem with doing that, we might have to use pselect(2) or >> similar in a few places. > > My main concern about alternative approaches is that IOTHREAD is about > to become the default, and hardly anyone (of the few upstream KVM users) > will run without it in the foreseeable future. The next step will be the > removal of any !CONFIG_IOTHREAD section. So, how much do we want to > invest here (provided my proposal has not remaining issues)? Yes, you're right. I'm not volunteering to dig more into this, the best case would be to switch to a non-I/O thread world that works for everybody. Stefan
On 2011-01-31 14:22, Avi Kivity wrote: > On 01/31/2011 01:27 PM, Jan Kiszka wrote: >> On 2011-01-31 11:03, Avi Kivity wrote: >>> On 01/27/2011 04:33 PM, Jan Kiszka wrote: >>>> Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between >>>> checking for exit_request on vcpu entry and timer signals arriving >>>> before KVM starts to catch them. Plug it by blocking both timer related >>>> signals also on !CONFIG_IOTHREAD and process those via signalfd. >>>> >>>> As this fix depends on real signalfd support (otherwise the timer >>>> signals only kick the compat helper thread, and the main thread hangs), >>>> we need to detect the invalid constellation and abort configure. >>>> >>>> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com> >>>> CC: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> >>>> --- >>>> >>>> I don't want to invest that much into !IOTHREAD anymore, so let's see if >>>> the proposed catch&abort is acceptable. >>>> >>> >>> I don't understand the dependency on signalfd. The normal way of doing >>> things, either waiting for the signal in sigtimedwait() or in >>> ioctl(KVM_RUN), works with SIGALRM just fine. >> >> And how would you be kicked out of the select() call if it is waiting >> with a timeout? We only have a single thread here. > > If we use signalfd() (either kernel provided or thread+pipe), we kick > out of select by select()ing it (though I don't see how it works without > an iothread, since an fd can't stop a vcpu unless you enable SIGIO on > it, which is silly for signalfd) > > If you leave it as a naked signal, then it can break out of either > pselect() or vcpu. > > Since the goal is to drop !CONFIG_IOTHREAD, the first path seems better, > I just don't understand the problem with emulated signalfd(). > With the emulated signalfd, there won't be any signal for the VCPU while in KVM_RUN. Jan
On 01/31/2011 04:31 PM, Jan Kiszka wrote: > >> > >> And how would you be kicked out of the select() call if it is waiting > >> with a timeout? We only have a single thread here. > > > > If we use signalfd() (either kernel provided or thread+pipe), we kick > > out of select by select()ing it (though I don't see how it works without > > an iothread, since an fd can't stop a vcpu unless you enable SIGIO on > > it, which is silly for signalfd) > > > > If you leave it as a naked signal, then it can break out of either > > pselect() or vcpu. > > > > Since the goal is to drop !CONFIG_IOTHREAD, the first path seems better, > > I just don't understand the problem with emulated signalfd(). > > > > With the emulated signalfd, there won't be any signal for the VCPU while > in KVM_RUN. > I see it now - with a real signalfd, kvm unmasks the signal, and that takes precedence over signalfd and exits the vcpu.
diff --git a/configure b/configure index 4673bf0..368ca8a 100755 --- a/configure +++ b/configure @@ -2056,6 +2056,12 @@ EOF if compile_prog "" "" ; then signalfd=yes +elif test "$kvm" = "yes" -a "$io_thread" != "yes"; then + echo + echo "ERROR: Host kernel lacks signalfd() support," + echo "but KVM depends on it when the IO thread is disabled." + echo + exit 1 fi # check if eventfd is supported diff --git a/cpus.c b/cpus.c index fc3f222..f9d9f9e 100644 --- a/cpus.c +++ b/cpus.c @@ -254,6 +254,10 @@ static void qemu_kvm_init_cpu_signals(CPUState *env) pthread_sigmask(SIG_BLOCK, NULL, &set); sigdelset(&set, SIG_IPI); sigdelset(&set, SIGBUS); +#ifndef CONFIG_IOTHREAD + sigdelset(&set, SIGIO); + sigdelset(&set, SIGALRM); +#endif r = kvm_set_signal_mask(env, &set); if (r) { fprintf(stderr, "kvm_set_signal_mask: %s\n", strerror(-r)); @@ -351,6 +355,12 @@ static void qemu_kvm_eat_signals(CPUState *env) exit(1); } } while (sigismember(&chkset, SIG_IPI) || sigismember(&chkset, SIGBUS)); + +#ifndef CONFIG_IOTHREAD + if (sigismember(&chkset, SIGIO) || sigismember(&chkset, SIGALRM)) { + qemu_notify_event(); + } +#endif } #else /* _WIN32 */ @@ -398,6 +408,14 @@ int qemu_init_main_loop(void) int ret; sigemptyset(&blocked_signals); + if (kvm_enabled()) { + /* + * We need to process timer signals synchronously to avoid a race + * between exit_request check and KVM vcpu entry. + */ + sigaddset(&blocked_signals, SIGIO); + sigaddset(&blocked_signals, SIGALRM); + } ret = qemu_signalfd_init(blocked_signals); if (ret) { @@ -535,6 +553,8 @@ static sigset_t block_io_signals(void) sigaddset(&set, SIGALRM); sigaddset(&set, SIG_IPI); sigaddset(&set, SIGBUS); + sigaddset(&set, SIGIO); + sigaddset(&set, SIGALRM); pthread_sigmask(SIG_BLOCK, &set, NULL); memset(&action, 0, sizeof(action));
Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between checking for exit_request on vcpu entry and timer signals arriving before KVM starts to catch them. Plug it by blocking both timer related signals also on !CONFIG_IOTHREAD and process those via signalfd. As this fix depends on real signalfd support (otherwise the timer signals only kick the compat helper thread, and the main thread hangs), we need to detect the invalid constellation and abort configure. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> CC: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> --- I don't want to invest that much into !IOTHREAD anymore, so let's see if the proposed catch&abort is acceptable. configure | 6 ++++++ cpus.c | 20 ++++++++++++++++++++ 2 files changed, 26 insertions(+), 0 deletions(-)