Message ID | 1381222058-16701-6-git-send-email-pbonzini@redhat.com |
---|---|
State | New |
Headers | show |
On 8 Oct 2013, at 09:47, Paolo Bonzini wrote: > Computing the deadline of all vm_clocks is somewhat expensive and calls > out to qemu-timer.c; two reasons not to do it in the seqlock's write-side > critical section. This however opens the door for races in setting and > reading vm_clock_warp_start. > > To plug them, we need to cover the case where a new deadline slips in > between the call to qemu_clock_deadline_ns_all and the actual modification > of the icount_warp_timer. Restrict changes to vm_clock_warp_start and > the icount_warp_timer's expiration time, to only move them back (which > would simply cause an early wakeup). > > If a vm_clock timer is cancelled while CPUs are idle, this might cause the > icount_warp_timer to fire unnecessarily. This is not a problem, after it > fires the timer becomes inactive and the next call to timer_mod_anticipate > will be precise. > > In addition to this, we must deactivate the icount_warp_timer _before_ > checking whether CPUs are idle. This way, if the "last" CPU becomes idle > during the call to timer_del we will still set up the icount_warp_timer. > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > cpus.c | 19 +++++++++---------- > 1 file changed, 9 insertions(+), 10 deletions(-) > > diff --git a/cpus.c b/cpus.c > index 9f450ad..08eaf23 100644 > --- a/cpus.c > +++ b/cpus.c > @@ -319,6 +319,7 @@ void qtest_clock_warp(int64_t dest) > > void qemu_clock_warp(QEMUClockType type) > { > + int64_t clock; > int64_t deadline; > > /* > @@ -338,7 +339,7 @@ void qemu_clock_warp(QEMUClockType type) > * the earliest QEMU_CLOCK_VIRTUAL timer. > */ > icount_warp_rt(NULL); > - if (!all_cpu_threads_idle() || !qemu_clock_has_timers(QEMU_CLOCK_VIRTUAL)) { > - timer_del(icount_warp_timer); > + timer_del(icount_warp_timer); > + if (!all_cpu_threads_idle()) { > return; > } > @@ -348,17 +349,11 @@ void qemu_clock_warp(QEMUClockType type) > return; > } > > - vm_clock_warp_start = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); > /* We want to use the earliest deadline from ALL vm_clocks */ > + clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); > deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL); > - > - /* Maintain prior (possibly buggy) behaviour where if no deadline > - * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than > - * INT32_MAX nanoseconds ahead, we still use INT32_MAX > - * nanoseconds. > - */ > - if ((deadline < 0) || (deadline > INT32_MAX)) { > - deadline = INT32_MAX; > + if (deadline < 0) { > + return; > } Arguably the patch could document why removing the check for deadline > INT32_MAX (the bug for bug compatibility) is safe, as I couldn't entirely convince myself it was, mostly because I couldn't see why it was doing it in the first place. > > if (deadline > 0) { > @@ -379,7 +375,10 @@ void qemu_clock_warp(QEMUClockType type) > * you will not be sending network packets continuously instead of > * every 100ms. > */ > - timer_mod(icount_warp_timer, vm_clock_warp_start + deadline); > + if (vm_clock_warp_start == -1 || vm_clock_warp_start > clock) { > + vm_clock_warp_start = clock; > + } > + timer_mod_anticipate(icount_warp_timer, clock + deadline); > } else if (deadline == 0) { > qemu_clock_notify(QEMU_CLOCK_VIRTUAL); > } > -- > 1.8.3.1 > > > >
Il 08/10/2013 18:54, Alex Bligh ha scritto: >> > - >> > - /* Maintain prior (possibly buggy) behaviour where if no deadline >> > - * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than >> > - * INT32_MAX nanoseconds ahead, we still use INT32_MAX >> > - * nanoseconds. >> > - */ >> > - if ((deadline < 0) || (deadline > INT32_MAX)) { >> > - deadline = INT32_MAX; >> > + if (deadline < 0) { >> > + return; >> > } > > Arguably the patch could document why removing the check for deadline > INT32_MAX > (the bug for bug compatibility) is safe, as I couldn't entirely convince myself it > was, mostly because I couldn't see why it was doing it in the first place. I couldn't convince myself that it is _not_ safe :) and it made the code more complicated. As soon as a deadline appears, qemu_clock_warp() will be called again and update the icount_warp_timer. Ok to move that to a separate patch? Paolo
On 8 Oct 2013, at 17:56, Paolo Bonzini wrote: >> Arguably the patch could document why removing the check for deadline > INT32_MAX >> (the bug for bug compatibility) is safe, as I couldn't entirely convince myself it >> was, mostly because I couldn't see why it was doing it in the first place. > > I couldn't convince myself that it is _not_ safe :) and it made the code > more complicated. As soon as a deadline appears, qemu_clock_warp() will > be called again and update the icount_warp_timer. > > Ok to move that to a separate patch? To be honest I put it in out of an abundance of caution. I am very tempted to take it out and see what breaks. As far as I can see all the time arithmetic is not int64_t; perhaps this was not always the case. I was more checking you hadn't removed it by accident. Perhaps just add "special casing deadlines > INT32_MAX removed as all arithmetic now 64 bit". There is another offender in tcg_cpu_exec. deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL); /* Maintain prior (possibly buggy) behaviour where if no deadline * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than * INT32_MAX nanoseconds ahead, we still use INT32_MAX * nanoseconds. */ if ((deadline < 0) || (deadline > INT32_MAX)) { deadline = INT32_MAX; } count = qemu_icount_round(deadline); qemu_icount += count; decr = (count > 0xffff) ? 0xffff : count; count -= decr; env->icount_decr.u16.low = decr; env->icount_extra = count; This implies that qemu_icount_round() cannot take a 64 bit int. static int64_t qemu_icount_round(int64_t count) { return (count + (1 << icount_time_shift) - 1) >> icount_time_shift; } I would have thought it better if qemu_icount_round just dealt sensibly with negative parameters.
Il 08/10/2013 19:08, Alex Bligh ha scritto: > > On 8 Oct 2013, at 17:56, Paolo Bonzini wrote: > >>> Arguably the patch could document why removing the check for deadline > INT32_MAX >>> (the bug for bug compatibility) is safe, as I couldn't entirely convince myself it >>> was, mostly because I couldn't see why it was doing it in the first place. >> >> I couldn't convince myself that it is _not_ safe :) and it made the code >> more complicated. As soon as a deadline appears, qemu_clock_warp() will >> be called again and update the icount_warp_timer. >> >> Ok to move that to a separate patch? > > To be honest I put it in out of an abundance of caution. I am very > tempted to take it out and see what breaks. As far as I can see all > the time arithmetic is not int64_t; perhaps this was not always the > case. I was more checking you hadn't removed it by accident. Perhaps > just add "special casing deadlines > INT32_MAX removed as all > arithmetic now 64 bit". > > There is another offender in tcg_cpu_exec. > > deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL); > > /* Maintain prior (possibly buggy) behaviour where if no deadline > * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than > * INT32_MAX nanoseconds ahead, we still use INT32_MAX > * nanoseconds. > */ > if ((deadline < 0) || (deadline > INT32_MAX)) { > deadline = INT32_MAX; > } > > count = qemu_icount_round(deadline); > qemu_icount += count; > decr = (count > 0xffff) ? 0xffff : count; > count -= decr; > env->icount_decr.u16.low = decr; > env->icount_extra = count; > > This implies that qemu_icount_round() cannot take a 64 bit int. > > static int64_t qemu_icount_round(int64_t count) > { > return (count + (1 << icount_time_shift) - 1) >> icount_time_shift; > } > > I would have thought it better if qemu_icount_round just > dealt sensibly with negative parameters. > I'll clean that up separately. Thanks, Paolo
diff --git a/cpus.c b/cpus.c index 9f450ad..08eaf23 100644 --- a/cpus.c +++ b/cpus.c @@ -319,6 +319,7 @@ void qtest_clock_warp(int64_t dest) void qemu_clock_warp(QEMUClockType type) { + int64_t clock; int64_t deadline; /* @@ -338,7 +339,7 @@ void qemu_clock_warp(QEMUClockType type) * the earliest QEMU_CLOCK_VIRTUAL timer. */ icount_warp_rt(NULL); - if (!all_cpu_threads_idle() || !qemu_clock_has_timers(QEMU_CLOCK_VIRTUAL)) { - timer_del(icount_warp_timer); + timer_del(icount_warp_timer); + if (!all_cpu_threads_idle()) { return; } @@ -348,17 +349,11 @@ void qemu_clock_warp(QEMUClockType type) return; } - vm_clock_warp_start = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); /* We want to use the earliest deadline from ALL vm_clocks */ + clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL); - - /* Maintain prior (possibly buggy) behaviour where if no deadline - * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than - * INT32_MAX nanoseconds ahead, we still use INT32_MAX - * nanoseconds. - */ - if ((deadline < 0) || (deadline > INT32_MAX)) { - deadline = INT32_MAX; + if (deadline < 0) { + return; } if (deadline > 0) { @@ -379,7 +375,10 @@ void qemu_clock_warp(QEMUClockType type) * you will not be sending network packets continuously instead of * every 100ms. */ - timer_mod(icount_warp_timer, vm_clock_warp_start + deadline); + if (vm_clock_warp_start == -1 || vm_clock_warp_start > clock) { + vm_clock_warp_start = clock; + } + timer_mod_anticipate(icount_warp_timer, clock + deadline); } else if (deadline == 0) { qemu_clock_notify(QEMU_CLOCK_VIRTUAL); }
Computing the deadline of all vm_clocks is somewhat expensive and calls out to qemu-timer.c; two reasons not to do it in the seqlock's write-side critical section. This however opens the door for races in setting and reading vm_clock_warp_start. To plug them, we need to cover the case where a new deadline slips in between the call to qemu_clock_deadline_ns_all and the actual modification of the icount_warp_timer. Restrict changes to vm_clock_warp_start and the icount_warp_timer's expiration time, to only move them back (which would simply cause an early wakeup). If a vm_clock timer is cancelled while CPUs are idle, this might cause the icount_warp_timer to fire unnecessarily. This is not a problem, after it fires the timer becomes inactive and the next call to timer_mod_anticipate will be precise. In addition to this, we must deactivate the icount_warp_timer _before_ checking whether CPUs are idle. This way, if the "last" CPU becomes idle during the call to timer_del we will still set up the icount_warp_timer. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- cpus.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-)