
[v4,1/5] cpu: Provide vcpu throttling interface

Message ID 1435855010-30882-2-git-send-email-jjherne@linux.vnet.ibm.com
State New

Commit Message

Jason J. Herne July 2, 2015, 4:36 p.m. UTC
Provide a method to throttle guest cpu execution. CPUState is augmented with
timeout controls and throttle start/stop functions. To throttle the guest cpu
the caller simply has to call the throttle set function and provide a percentage
of throttle time.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
---
 cpus.c            | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 include/qom/cpu.h | 38 ++++++++++++++++++++++++++++++++
 2 files changed, 104 insertions(+)

Comments

Paolo Bonzini July 2, 2015, 4:43 p.m. UTC | #1
On 02/07/2015 18:36, Jason J. Herne wrote:
> +static void cpu_throttle_thread(void *opaque)
> +{
> +    double pct = (double)throttle_percentage/100;
> +    double throttle_ratio = pct / (1 - pct);
> +    long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);
> +
> +    if (!throttle_percentage) {
> +        return;
> +    }
> +
> +    qemu_mutex_unlock_iothread();
> +    g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
> +    qemu_mutex_lock_iothread();
> +}
> +
> +static void cpu_throttle_timer_tick(void *opaque)
> +{
> +    CPUState *cpu;
> +
> +    /* Stop the timer if needed */
> +    if (!throttle_percentage) {
> +        return;
> +    }
> +    CPU_FOREACH(cpu) {
> +        async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
> +    }
> +
> +    timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
> +                                   CPU_THROTTLE_TIMESLICE);
> +}

This could cause callbacks to pile up I think.  David, do you have any
idea how to fix it?

Paolo
Andreas Färber July 2, 2015, 4:47 p.m. UTC | #2
Am 02.07.2015 um 18:36 schrieb Jason J. Herne:
> Provide a method to throttle guest cpu execution. CPUState is augmented with
> timeout controls and throttle start/stop functions. To throttle the guest cpu
> the caller simply has to call the throttle set function and provide a percentage
> of throttle time.
> 
> Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
> Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
> ---
>  cpus.c            | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/qom/cpu.h | 38 ++++++++++++++++++++++++++++++++
>  2 files changed, 104 insertions(+)

No objections from my side, but the interesting code is outside my area.

I feel we (including myself) are abusing include/qom/cpu.h (here there's
not even a single CPUState argument) but I don't have a better
suggestion. At some point we'll need to revisit the cpu.h vs. cpu-all.h
etc. split or even introduce something new.

I'm preparing a qom-cpu pull and assume this will go through the
migration tree when finalized.

Regards,
Andreas
Dr. David Alan Gilbert July 2, 2015, 4:58 p.m. UTC | #3
* Paolo Bonzini (pbonzini@redhat.com) wrote:
> 
> 
> On 02/07/2015 18:36, Jason J. Herne wrote:
> > +static void cpu_throttle_thread(void *opaque)
> > +{
> > +    double pct = (double)throttle_percentage/100;
> > +    double throttle_ratio = pct / (1 - pct);
> > +    long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);
> > +
> > +    if (!throttle_percentage) {
> > +        return;
> > +    }
> > +
> > +    qemu_mutex_unlock_iothread();
> > +    g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
> > +    qemu_mutex_lock_iothread();
> > +}
> > +
> > +static void cpu_throttle_timer_tick(void *opaque)
> > +{
> > +    CPUState *cpu;
> > +
> > +    /* Stop the timer if needed */
> > +    if (!throttle_percentage) {
> > +        return;
> > +    }
> > +    CPU_FOREACH(cpu) {
> > +        async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
> > +    }
> > +
> > +    timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
> > +                                   CPU_THROTTLE_TIMESLICE);
> > +}
> 
> This could cause callbacks to pile up I think.  David, do you have any
> idea how to fix it?

I don't know the timer code well enough.

Dave
> 
> Paolo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Jason J. Herne July 13, 2015, 2:43 p.m. UTC | #4
On 07/02/2015 12:43 PM, Paolo Bonzini wrote:
>
>
> On 02/07/2015 18:36, Jason J. Herne wrote:
>> +static void cpu_throttle_thread(void *opaque)
>> +{
>> +    double pct = (double)throttle_percentage/100;
>> +    double throttle_ratio = pct / (1 - pct);
>> +    long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);
>> +
>> +    if (!throttle_percentage) {
>> +        return;
>> +    }
>> +
>> +    qemu_mutex_unlock_iothread();
>> +    g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
>> +    qemu_mutex_lock_iothread();
>> +}
>> +
>> +static void cpu_throttle_timer_tick(void *opaque)
>> +{
>> +    CPUState *cpu;
>> +
>> +    /* Stop the timer if needed */
>> +    if (!throttle_percentage) {
>> +        return;
>> +    }
>> +    CPU_FOREACH(cpu) {
>> +        async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
>> +    }
>> +
>> +    timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
>> +                                   CPU_THROTTLE_TIMESLICE);
>> +}
>
> This could cause callbacks to pile up I think.  David, do you have any
> idea how to fix it?
>
> Paolo
>
>
>

I'm not sure how callbacks can pile up here. If the vcpus are running 
then their threads will execute the callbacks. If they are not running 
then the use of QEMU_CLOCK_VIRTUAL_RT will prevent the callbacks from 
stacking because the timer is not running, right?
Paolo Bonzini July 13, 2015, 3:14 p.m. UTC | #5
On 13/07/2015 16:43, Jason J. Herne wrote:
>>>
>>> +    CPU_FOREACH(cpu) {
>>> +        async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
>>> +    }
>>> +
>>> +    timer_mod(throttle_timer,
>>> qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
>>> +                                   CPU_THROTTLE_TIMESLICE);
>>> +}
>>
>> This could cause callbacks to pile up I think.  David, do you have any
>> idea how to fix it?
> 
> I'm not sure how callbacks can pile up here. If the vcpus are running
> then their thread's will execute the callbacks. If they are not running
> then the use of QEMU_CLOCK_VIRTUAL_RT will prevent the callbacks from
> stacking because the timer is not running, right?

Couldn't the iothread starve the VCPUs?  They need to take the iothread
lock in order to process the callbacks.

Paolo
Jason J. Herne July 15, 2015, 12:40 p.m. UTC | #6
On 07/13/2015 11:14 AM, Paolo Bonzini wrote:
>
>
> On 13/07/2015 16:43, Jason J. Herne wrote:
>>>>
>>>> +    CPU_FOREACH(cpu) {
>>>> +        async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
>>>> +    }
>>>> +
>>>> +    timer_mod(throttle_timer,
>>>> qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
>>>> +                                   CPU_THROTTLE_TIMESLICE);
>>>> +}
>>>
>>> This could cause callbacks to pile up I think.  David, do you have any
>>> idea how to fix it?
>>
>> I'm not sure how callbacks can pile up here. If the vcpus are running
>> then their thread's will execute the callbacks. If they are not running
>> then the use of QEMU_CLOCK_VIRTUAL_RT will prevent the callbacks from
>> stacking because the timer is not running, right?
>
> Couldn't the iothread starve the VCPUs?  They need to take the iothread
> lock in order to process the callbacks.
>

Yes, I can see the possibility here. I'm not sure what to do about it 
though.

Maybe this is wishful thinking :) But if the iothread lock cannot be 
acquired then the cpu cannot run, thereby preventing the guest from 
changing a ton of pages. This will have the effect of indirectly 
throttling the guest, which will allow us to advance to the non-live 
phase of migration rather quickly. And again, if we are starving on the 
iothread lock then the guest vcpus are not executing and 
QEMU_CLOCK_VIRTUAL_RT is not ticking, right? This will also limit the 
number of stacked callbacks to a very low number. Unless I'm missing 
something?
Paolo Bonzini July 15, 2015, 12:54 p.m. UTC | #7
On 15/07/2015 14:40, Jason J. Herne wrote:
>>> I'm not sure how callbacks can pile up here. If the vcpus are
>>> running then their thread's will execute the callbacks. If they
>>> are not running then the use of QEMU_CLOCK_VIRTUAL_RT will
>>> prevent the callbacks from stacking because the timer is not
>>> running, right?
>> 
>> Couldn't the iothread starve the VCPUs?  They need to take the
>> iothread lock in order to process the callbacks.
> 
> Yes, I can see the possibility here. I'm not sure what to do about
> it though.
> 
> Maybe this is wishful thinking :) But if the iothread lock cannot be 
> acquired then the cpu cannot run thereby preventing the guest from
> changing a ton of pages. This will have the effect of indirectly
> throttling the guest which will allow us to advance to the non-live
> phase of migration rather quickly.

Makes sense.  On the other hand this wouldn't prevent callbacks from
piling up for a short time because...

> And again, if we are starving on
> the iothread lock then the guest vcpus are not executing and 
> QEMU_CLOCK_VIRTUAL_RT is not ticking, right?

... you are talking about stolen time, and QEMU_CLOCK_VIRTUAL_RT does
count stolen time (stolen time is different for each VCPU, so you would
have a different clock for each VCPU).

QEMU_CLOCK_VIRTUAL and QEMU_CLOCK_VIRTUAL_RT(*) only pause across
stop/cont.  (By the way, the two are the same with KVM).

However, something like

	if (!atomic_xchg(&cpu->throttle_thread_scheduled, 1)) {
	    async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
	}

...
	atomic_set(&cpu->throttle_thread_scheduled, 0);
	g_usleep(...);

should be enough.  You'd still have many timers that could starve the
VCPUs but, as you pointed out, in that case migration would hopefully
finish pretty fast.

Paolo
Jason J. Herne July 16, 2015, 2:21 p.m. UTC | #8
On 07/15/2015 08:54 AM, Paolo Bonzini wrote:
>
>
> On 15/07/2015 14:40, Jason J. Herne wrote:
>>>> I'm not sure how callbacks can pile up here. If the vcpus are
>>>> running then their thread's will execute the callbacks. If they
>>>> are not running then the use of QEMU_CLOCK_VIRTUAL_RT will
>>>> prevent the callbacks from stacking because the timer is not
>>>> running, right?
>>>
>>> Couldn't the iothread starve the VCPUs?  They need to take the
>>> iothread lock in order to process the callbacks.
>>
>> Yes, I can see the possibility here. I'm not sure what to do about
>> it though.
>>
>> Maybe this is wishful thinking :) But if the iothread lock cannot be
>> acquired then the cpu cannot run thereby preventing the guest from
>> changing a ton of pages. This will have the effect of indirectly
>> throttling the guest which will allow us to advance to the non-live
>> phase of migration rather quickly.
>
> Makes sense.  On the other hand this wouldn't prevent callbacks from
> piling up for a short time because...
>
>> And again, if we are starving on
>> the iothread lock then the guest vcpus are not executing and
>> QEMU_CLOCK_VIRTUAL_RT is not ticking, right?
>
> ... you are talking about stolen time, and QEMU_CLOCK_VIRTUAL_RT does
> count stolen time (stolen time is different for each VCPU, so you would
> have a different clock for each VCPU).
>
> QEMU_CLOCK_VIRTUAL and QEMU_CLOCK_VIRTUAL_RT(*) only pause across
> stop/cont.  (By the way, the two are the same with KVM).
>
> However, something like
>
> 	if (!atomic_xchg(&cpu->throttle_thread_scheduled, 1)) {
> 	    async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
> 	}
>
> ...
> 	atomic_set(&cpu->throttle_thread_scheduled, 0);
> 	g_usleep(...);
>
> should be enough.  You'd still have many timers that could starve the
> VCPUs but, as you pointed out, in that case migration would hopefully
> finish pretty fast.
>
> Paolo
>
>

Paolo, Andreas & David, thanks for the review comments.

Has this advanced enough for a reviewed-by? The only remaining 
objections I can find are:

1. Using atomic operations to manage throttle_percentage. I'm not sure 
where atomics are applicable here. If this is still a concern, hopefully 
someone can explain.

2. Callback stacking. And it seems like we are convinced that it is not 
a big issue. Anyone disagree?
Paolo Bonzini July 23, 2015, 9:59 a.m. UTC | #9
On 16/07/2015 16:21, Jason J. Herne wrote:
> 1. Using atomic operations to manage throttle_percentage. I'm not sure
> where atomics are applicable here. If this is still a concern hopefully
> someone can explain.

I would use atomic_read/atomic_set in cpu_throttle_set, 
cpu_throttle_stop, cpu_throttle_active, cpu_throttle_get_percentage.  
In addition, the function naming seems to be a bit inconsistent: please 
rename cpu_throttle_set to cpu_throttle_set_percentage.

Second, here:

>> +static void cpu_throttle_thread(void *opaque)
>> +{
>> +    double pct = (double)throttle_percentage/100;

Please use cpu_throttle_get_percentage(), and

>> +    double throttle_ratio = pct / (1 - pct);
>> +    long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);

... move these computations below the if.

I'm also not sure about throttle_ratio, why is it needed?  If pct >= 0.5 you
end up with throttle_ratio >= 1, i.e. no way for the CPU to do any work.  This
would definitely cause a problem with callbacks piling up.

>> +    if (!throttle_percentage) {
>> +        return;
>> +    }
>> +
>> +    qemu_mutex_unlock_iothread();
>> +    g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
>> +    qemu_mutex_lock_iothread();
>> +}
>> +

> 2. Callback stacking. And it seems like we are convinced that it is not
> a big issue. Anyone disagree?

I think it's not a big issue to have many timers, but it is a big issue to have many callbacks.  What I suggested is this:

    if (!atomic_xchg(&cpu->throttle_thread_scheduled, 1)) {
        async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
    }

and in the callback:

    atomic_set(&cpu->throttle_thread_scheduled, 0);
    g_usleep(...); 

Paolo
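
Putting those suggestions together, a rough sketch of what the guarded
scheduling could look like (this assumes a new throttle_thread_scheduled
flag is added to CPUState; the field name and the exact shape of the code
are illustrative, not taken from the final series):

    static void cpu_throttle_thread(void *opaque)
    {
        CPUState *cpu = opaque;
        double pct, throttle_ratio;
        long sleeptime_ms;

        if (!cpu_throttle_get_percentage()) {
            return;
        }

        pct = (double)cpu_throttle_get_percentage() / 100;
        throttle_ratio = pct / (1 - pct);
        sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);

        /* Clear the flag before sleeping so the timer may queue the next
         * request while this vcpu is asleep. */
        atomic_set(&cpu->throttle_thread_scheduled, 0);
        qemu_mutex_unlock_iothread();
        g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
        qemu_mutex_lock_iothread();
    }

    static void cpu_throttle_timer_tick(void *opaque)
    {
        CPUState *cpu;

        if (!cpu_throttle_get_percentage()) {
            return; /* throttling was stopped; let the timer lapse */
        }
        CPU_FOREACH(cpu) {
            /* Never queue more than one pending callback per vcpu. */
            if (!atomic_xchg(&cpu->throttle_thread_scheduled, 1)) {
                async_run_on_cpu(cpu, cpu_throttle_thread, cpu);
            }
        }
        timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
                                  CPU_THROTTLE_TIMESLICE);
    }

Passing the cpu as the opaque argument (rather than NULL as in the patch)
lets the callback clear its own flag.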
Jason J. Herne July 31, 2015, 5:12 p.m. UTC | #10
On 07/23/2015 05:59 AM, Paolo Bonzini wrote:
>
>
> On 16/07/2015 16:21, Jason J. Herne wrote:
>> 1. Using atomic operations to manage throttle_percentage. I'm not sure
>> where atomics are applicable here. If this is still a concern hopefully
>> someone can explain.
>
> I would use atomic_read/atomic_set in cpu_throttle_set,
> cpu_throttle_stop, cpu_throttle_active, cpu_throttle_get_percentage.
> In addition, the function naming seems to be a bit inconsistent: please
> rename cpu_throttle_set to cpu_throttle_set_percentage.
>
> Second, here:
>
>>> +static void cpu_throttle_thread(void *opaque)
>>> +{
>>> +    double pct = (double)throttle_percentage/100;
>
> Please use cpu_throttle_get_percentage(), and
>
>>> +    double throttle_ratio = pct / (1 - pct);
>>> +    long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);
>
> ... move these computations below the if.
>
> I'm also not sure about throttle_ratio, why is it needed?  If pct >= 0.5 you
> end up with throttle_ratio >= 1, i.e. no way for the CPU to do any work.  This
> would definitely cause a problem with callbacks piling up.
>

Throttle ratio is relative to CPU_THROTTLE_TIMESLICE. Take a look at how
throttle_ratio is used in the calculation:

long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);

A value of 1 means we sleep the same amount of time that we execute.
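
As a quick worked check of that ratio (assuming each vcpu runs for one
CPU_THROTTLE_TIMESLICE of 10 ms between sleeps; the numbers below are
illustrative):

    pct = 0.2: throttle_ratio = 0.2/0.8 = 0.25 -> sleep 2.5 ms per 10 ms awake
               -> 2.5 / (10 + 2.5) = 20% of the time asleep
    pct = 0.5: throttle_ratio = 0.5/0.5 = 1.0  -> sleep 10 ms per 10 ms awake
               -> 10 / (10 + 10)   = 50% of the time asleep

In general throttle_ratio / (1 + throttle_ratio) = pct, which is why
pct / (1 - pct) is used rather than pct itself.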

>>> +    if (!throttle_percentage) {
>>> +        return;
>>> +    }
>>> +
>>> +    qemu_mutex_unlock_iothread();
>>> +    g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
>>> +    qemu_mutex_lock_iothread();
>>> +}
>>> +
>
>> 2. Callback stacking. And it seems like we are convinced that it is not
>> a big issue. Anyone disagree?
>
> I think it's not a big issue to have many timers, but it is a big issue to have many callbacks.  What I suggested is this:
>
>      if (!atomic_xchg(&cpu->throttle_thread_scheduled, 1)) {
>          async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
>      }
>
> and in the callback:
>
>      atomic_set(&cpu->throttle_thread_scheduled, 0);
>      g_usleep(...);
>
> Paolo
>
>
Paolo Bonzini July 31, 2015, 5:16 p.m. UTC | #11
On 31/07/2015 19:12, Jason J. Herne wrote:
>>
>>
> 
> Throttle ratio is relative to CPU_THROTTLE_TIMESLICE. Take a look at how
> throttle_ratio is used in the calculation:
> 
> long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);
> 
> A value of 1 means we sleep the same amount of time that we execute.

But that doesn't work if your timer runs every CPU_THROTTLE_TIMESLICE
milliseconds, and thus schedules async work every CPU_THROTTLE_TIMESLICE
milliseconds.

The timer would have to be scheduled every (throttle_ratio + 1) *
CPU_THROTTLE_TIMESLICE milliseconds, i.e. CPU_THROTTLE_TIMESLICE /
(1-pct) milliseconds.

Paolo
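
The two expressions are the same thing, since:

    throttle_ratio + 1 = pct/(1 - pct) + 1 = (pct + 1 - pct)/(1 - pct) = 1/(1 - pct)

so (throttle_ratio + 1) * CPU_THROTTLE_TIMESLICE equals
CPU_THROTTLE_TIMESLICE / (1 - pct).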
Jason J. Herne July 31, 2015, 5:42 p.m. UTC | #12
On 07/31/2015 01:16 PM, Paolo Bonzini wrote:
>
>
> On 31/07/2015 19:12, Jason J. Herne wrote:
>>>
>>>
>>
>> Throttle ratio is relative to CPU_THROTTLE_TIMESLICE. Take a look at how
>> throttle_ratio is used in the calculation:
>>
>> long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);
>>
>> A value of 1 means we sleep the same amount of time that we execute.
>
> But that doesn't work if your timer runs every CPU_THROTTLE_TIMESLICE
> milliseconds, and thus schedules async work every CPU_THROTTLE_TIMESLICE
> milliseconds.
>
> The timer would have to be scheduled every (throttle_ratio + 1) *
> CPU_THROTTLE_TIMESLICE milliseconds, i.e. CPU_THROTTLE_TIMESLICE /
> (1-pct) milliseconds.
>
> Paolo

Doh! Yep :). This problem is an artifact of moving the timer_mod from
cpu_throttle_thread into cpu_throttle_timer_tick. I'll have to go back
to the review comments and look at why that was done.
Jason J. Herne July 31, 2015, 6:11 p.m. UTC | #13
On 07/31/2015 01:42 PM, Jason J. Herne wrote:
> On 07/31/2015 01:16 PM, Paolo Bonzini wrote:
>>
>>
>> On 31/07/2015 19:12, Jason J. Herne wrote:
>>>>
>>>>
>>>
>>> Throttle ratio is relative to CPU_THROTTLE_TIMESLICE. Take a look at how
>>> throttle_ratio is used in the calculation:
>>>
>>> long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);
>>>
>>> A value of 1 means we sleep the same amount of time that we execute.
>>
>> But that doesn't work if your timer runs every CPU_THROTTLE_TIMESLICE
>> milliseconds, and thus schedules async work every CPU_THROTTLE_TIMESLICE
>> milliseconds.
>>
>> The timer would have to be scheduled every (throttle_ratio + 1) *
>> CPU_THROTTLE_TIMESLICE milliseconds, i.e. CPU_THROTTLE_TIMESLICE /
>> (1-pct) milliseconds.
>>
>> Paolo
>
> Doh! Yep :). This problem is an artifact of moving the timer_mod from
> cpu_throttle_thread into cpu_throttle_timer_tick. I'll have to go back
> to the review comments and look at why that was done.

So, we made that change in v3 to eliminate the per cpu timer. With a per
cpu timer we avoid this problem and we no longer need to worry about a
throttle_thread_scheduled flag or timers stacking. Paolo, you had originally
argued in favor of this change. With what we know now, do you still think
having only a single timer is best? Or should I switch back to a timer per
cpu? With a timer per cpu we can simply reset the timer immediately after
the sleep.

I guess an alternative would be for the last cpu to complete its sleep to
reset the timer in cpu_throttle_thread. We would need an atomic flag in
CPUState and a loop to run it and bail out if any cpu has the flag set.
Paolo Bonzini Aug. 1, 2015, 9:40 a.m. UTC | #14
On 31/07/2015 20:11, Jason J. Herne wrote:
>>>
>>
>> Doh! Yep :). This problem is an artifact of moving the timer_mod from
>> cpu_throttle_thread into cpu_throttle_timer_tick. I'll have to go back
>> to the review comments and look at why that was done.
> 
> So, we made that change in v3 to eliminate the per cpu timer. With a per
> cpu timer we avoid this problem and we no longer need to worry about
> a throttle_thread_scheduled, and timers stacking. Paolo, you had originally
> argued in favor of this change. With what we know now, do you still think
> having only a single timer is best? Or should I switch back to a timer per
> cpu? With a timer per cpu we can simply reset the timer immediately after
> the sleep.

It's okay to have a single timer, only the formulas have to be
corrected: either you remove the pct/(1-pct) from the callback or you
add a /(1-pct) to the timer_mod.

Paolo
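
A sketch of the second option (keep pct/(1 - pct) in the callback and
stretch the timer period instead); this is only an illustration of the
formula fix, and the reworked series may differ in its details:

    static void cpu_throttle_timer_tick(void *opaque)
    {
        CPUState *cpu;
        double pct;

        /* Stop the timer if needed */
        if (!cpu_throttle_get_percentage()) {
            return;
        }
        CPU_FOREACH(cpu) {
            async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
        }

        /* Arm the next tick one full run+sleep period from now:
         * (throttle_ratio + 1) * CPU_THROTTLE_TIMESLICE
         *   == CPU_THROTTLE_TIMESLICE / (1 - pct)
         */
        pct = (double)cpu_throttle_get_percentage() / 100;
        timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
                                  CPU_THROTTLE_TIMESLICE / (1 - pct));
    }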
Jason J. Herne Sept. 1, 2015, 2:43 p.m. UTC | #15
On 08/01/2015 05:40 AM, Paolo Bonzini wrote:
>
>
> On 31/07/2015 20:11, Jason J. Herne wrote:
>>>>
>>>
>>> Doh! Yep :). This problem is an artifact of moving the timer_mod from
>>> cpu_throttle_thread into cpu_throttle_timer_tick. I'll have to go back
>>> to the review comments and look at why that was done.
>>
>> So, we made that change in v3 to eliminate the per cpu timer. With a per
>> cpu timer we avoid this problem and we no longer need to worry about
>> a throttle_thread_scheduled, and timers stacking. Paolo, you had originally
>> argued in favor of this change. With what we know now, do you still think
>> having only a single timer is best? Or should I switch back to a timer per
>> cpu? With a timer per cpu we can simply reset the timer immediately after
>> the sleep.
>
> It's okay to have a single timer, only the formulas have to be
> corrected: either you remove the pct/(1-pct) from the callback or you
> add a /(1-pct) to the timer_mod.
>
> Paolo

Paolo,

You are correct here. I've adjusted the timer formula and tested. 
Everything seems to be playing nicely now. Sorry it took me a month to 
get to this. I got pulled into some critical work and improved 
auto-converge took a back seat. I know it is a pain to go back to 
something you have not seen in a month so I appreciate any attention 
this gets :). A new patch set will be inbound shortly...

Patch

diff --git a/cpus.c b/cpus.c
index de6469f..6f86da0 100644
--- a/cpus.c
+++ b/cpus.c
@@ -68,6 +68,14 @@  static CPUState *next_cpu;
 int64_t max_delay;
 int64_t max_advance;
 
+/* vcpu throttling controls */
+static QEMUTimer *throttle_timer;
+static unsigned int throttle_percentage;
+
+#define CPU_THROTTLE_PCT_MIN 1
+#define CPU_THROTTLE_PCT_MAX 99
+#define CPU_THROTTLE_TIMESLICE 10
+
 bool cpu_is_stopped(CPUState *cpu)
 {
     return cpu->stopped || !runstate_is_running();
@@ -486,10 +494,68 @@  static const VMStateDescription vmstate_timers = {
     }
 };
 
+static void cpu_throttle_thread(void *opaque)
+{
+    double pct = (double)throttle_percentage/100;
+    double throttle_ratio = pct / (1 - pct);
+    long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);
+
+    if (!throttle_percentage) {
+        return;
+    }
+
+    qemu_mutex_unlock_iothread();
+    g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
+    qemu_mutex_lock_iothread();
+}
+
+static void cpu_throttle_timer_tick(void *opaque)
+{
+    CPUState *cpu;
+
+    /* Stop the timer if needed */
+    if (!throttle_percentage) {
+        return;
+    }
+    CPU_FOREACH(cpu) {
+        async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
+    }
+
+    timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
+                                   CPU_THROTTLE_TIMESLICE);
+}
+
+void cpu_throttle_set(int new_throttle_pct)
+{
+    /* Ensure throttle percentage is within valid range */
+    new_throttle_pct = MIN(new_throttle_pct, CPU_THROTTLE_PCT_MAX);
+    throttle_percentage = MAX(new_throttle_pct, CPU_THROTTLE_PCT_MIN);
+
+    timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
+                                       CPU_THROTTLE_TIMESLICE);
+}
+
+void cpu_throttle_stop(void)
+{
+    throttle_percentage = 0;
+}
+
+bool cpu_throttle_active(void)
+{
+    return (throttle_percentage != 0);
+}
+
+int cpu_throttle_get_percentage(void)
+{
+    return throttle_percentage;
+}
+
 void cpu_ticks_init(void)
 {
     seqlock_init(&timers_state.vm_clock_seqlock, NULL);
     vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
+    throttle_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL_RT,
+                                           cpu_throttle_timer_tick, NULL);
 }
 
 void configure_icount(QemuOpts *opts, Error **errp)
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 39f0f19..56eb964 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -553,6 +553,44 @@  CPUState *qemu_get_cpu(int index);
  */
 bool cpu_exists(int64_t id);
 
+/**
+ * cpu_throttle_set:
+ * @new_throttle_pct: Percent of sleep time to running time.
+ *                    Valid range is 1 to 99.
+ *
+ * Throttles all vcpus by forcing them to sleep for the given percentage of
+ * time. A throttle_percentage of 50 corresponds to a 50% duty cycle roughly.
+ * (example: 10ms sleep for every 10ms awake).
+ *
+ * cpu_throttle_set can be called as needed to adjust new_throttle_pct.
+ * Once the throttling starts, it will remain in effect until cpu_throttle_stop
+ * is called.
+ */
+void cpu_throttle_set(int new_throttle_pct);
+
+/**
+ * cpu_throttle_stop:
+ *
+ * Stops the vcpu throttling started by cpu_throttle_set.
+ */
+void cpu_throttle_stop(void);
+
+/**
+ * cpu_throttle_active:
+ *
+ * Returns %true if the vcpus are currently being throttled, %false otherwise.
+ */
+bool cpu_throttle_active(void);
+
+/**
+ * cpu_throttle_get_percentage:
+ *
+ * Returns the vcpu throttle percentage. See cpu_throttle_set for details.
+ *
+ * Returns The throttle percentage in range 1 to 99.
+ */
+int cpu_throttle_get_percentage(void);
+
 #ifndef CONFIG_USER_ONLY
 
 typedef void (*CPUInterruptHandler)(CPUState *, int);
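
For context, a hypothetical caller of this interface (presumably the
migration auto-converge code later in this series; the function below is
illustrative only and not part of the patch) might drive it like this:

    /* Hypothetical: ramp the throttle up while dirty pages outpace
     * transfer, then release it once migration completes.
     */
    static void example_ramp_throttle(bool migration_done)
    {
        if (migration_done) {
            cpu_throttle_stop();          /* timer lapses on its next tick */
            return;
        }

        if (!cpu_throttle_active()) {
            cpu_throttle_set(20);         /* vcpus sleep ~20% of the time */
        } else {
            int pct = cpu_throttle_get_percentage();
            cpu_throttle_set(pct + 10);   /* clamped to 99 by cpu_throttle_set */
        }
    }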