diff mbox

[1/3] blockjob: Introduce block_job_relax_cpu

Message ID 1436413678-7114-2-git-send-email-famz@redhat.com
State New
Headers show

Commit Message

Fam Zheng July 9, 2015, 3:47 a.m. UTC
block_job_sleep_ns is called by block job coroutines to yield the
execution to VCPU threads and monitor etc. It is pointless to sleep for
0 or a few nanoseconds, because that equals to a "yield + enter" with no
intermission in between (the timer fires immediately in the same
iteration of event loop), which means other code still doesn't get a
fair share of main loop / BQL.

Introduce block_job_relax_cpu which will at least for
BLOCK_JOB_RELAX_CPU_NS. Existing block_job_sleep_ns(job, 0) callers can
be replaced by this later.

Reported-by: Alexandre DERUMIER <aderumier@odiso.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 include/block/blockjob.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Stefan Hajnoczi July 9, 2015, 12:54 p.m. UTC | #1
On Thu, Jul 09, 2015 at 11:47:56AM +0800, Fam Zheng wrote:
> block_job_sleep_ns is called by block job coroutines to yield the
> execution to VCPU threads and monitor etc. It is pointless to sleep for
> 0 or a few nanoseconds, because that equals to a "yield + enter" with no
> intermission in between (the timer fires immediately in the same
> iteration of event loop), which means other code still doesn't get a
> fair share of main loop / BQL.
> 
> Introduce block_job_relax_cpu which will at least for
> BLOCK_JOB_RELAX_CPU_NS. Existing block_job_sleep_ns(job, 0) callers can
> be replaced by this later.
> 
> Reported-by: Alexandre DERUMIER <aderumier@odiso.com>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  include/block/blockjob.h | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> index 57d8ef1..53ac4f4 100644
> --- a/include/block/blockjob.h
> +++ b/include/block/blockjob.h
> @@ -157,6 +157,22 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
>   */
>  void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns);
>  
> +#define BLOCK_JOB_RELAX_CPU_NS 10000000L

By the way, why did you choose 10 milliseconds?  That is quite long.

If this function is called once per 10 ms disk I/O operations then we
lose 50% utilization.  1 ms or less would be reasonable.

> +
> +/**
> + * block_job_relax_cpu:
> + * @job: The job that calls the function.
> + *
> + * Sleep a little to avoid intensive cpu time occupation. Block jobs should
> + * call this or block_job_sleep_ns (for more precision, but note that 0 ns is
> + * usually not enought) periodically, otherwise the QMP and VCPU could starve

s/enought/enough/

> + * on CPU and/or BQL.
> + */
> +static inline void block_job_relax_cpu(BlockJob *job)

coroutine_fn is missing.
Alexandre DERUMIER July 10, 2015, 3:42 a.m. UTC | #2
Hi Stefan,

>>By the way, why did you choose 10 milliseconds?  That is quite long.
>>
>>If this function is called once per 10 ms disk I/O operations then we
>>lose 50% utilization.  1 ms or less would be reasonable.

From my tests, 1ms is not enough, It still hanging in guest or qmp queries.
10ms give me optimal balance between bitmap scan speed and guest responsiveness.

I don't known if this can be compute automaticaly ? (maybe it's depend of disk lseek speed and cpu speed).


----- Mail original -----
De: "Stefan Hajnoczi" <stefanha@gmail.com>
À: "Fam Zheng" <famz@redhat.com>
Cc: "Kevin Wolf" <kwolf@redhat.com>, "qemu-devel" <qemu-devel@nongnu.org>, qemu-block@nongnu.org
Envoyé: Jeudi 9 Juillet 2015 14:54:55
Objet: Re: [Qemu-devel] [Qemu-block] [PATCH 1/3] blockjob: Introduce	block_job_relax_cpu

On Thu, Jul 09, 2015 at 11:47:56AM +0800, Fam Zheng wrote: 
> block_job_sleep_ns is called by block job coroutines to yield the 
> execution to VCPU threads and monitor etc. It is pointless to sleep for 
> 0 or a few nanoseconds, because that equals to a "yield + enter" with no 
> intermission in between (the timer fires immediately in the same 
> iteration of event loop), which means other code still doesn't get a 
> fair share of main loop / BQL. 
> 
> Introduce block_job_relax_cpu which will at least for 
> BLOCK_JOB_RELAX_CPU_NS. Existing block_job_sleep_ns(job, 0) callers can 
> be replaced by this later. 
> 
> Reported-by: Alexandre DERUMIER <aderumier@odiso.com> 
> Signed-off-by: Fam Zheng <famz@redhat.com> 
> --- 
> include/block/blockjob.h | 16 ++++++++++++++++ 
> 1 file changed, 16 insertions(+) 
> 
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h 
> index 57d8ef1..53ac4f4 100644 
> --- a/include/block/blockjob.h 
> +++ b/include/block/blockjob.h 
> @@ -157,6 +157,22 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs, 
> */ 
> void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns); 
> 
> +#define BLOCK_JOB_RELAX_CPU_NS 10000000L 

By the way, why did you choose 10 milliseconds? That is quite long. 

If this function is called once per 10 ms disk I/O operations then we 
lose 50% utilization. 1 ms or less would be reasonable. 

> + 
> +/** 
> + * block_job_relax_cpu: 
> + * @job: The job that calls the function. 
> + * 
> + * Sleep a little to avoid intensive cpu time occupation. Block jobs should 
> + * call this or block_job_sleep_ns (for more precision, but note that 0 ns is 
> + * usually not enought) periodically, otherwise the QMP and VCPU could starve 

s/enought/enough/ 

> + * on CPU and/or BQL. 
> + */ 
> +static inline void block_job_relax_cpu(BlockJob *job) 

coroutine_fn is missing.
Stefan Hajnoczi July 14, 2015, 12:31 p.m. UTC | #3
On Fri, Jul 10, 2015 at 05:42:48AM +0200, Alexandre DERUMIER wrote:
> >>By the way, why did you choose 10 milliseconds?  That is quite long.
> >>
> >>If this function is called once per 10 ms disk I/O operations then we
> >>lose 50% utilization.  1 ms or less would be reasonable.
> 
> From my tests, 1ms is not enough, It still hanging in guest or qmp queries.
> 10ms give me optimal balance between bitmap scan speed and guest responsiveness.

Then I don't fully understand the bug.

Fam: can you explain why 1ms isn't enough?
Fam Zheng July 15, 2015, 10:32 a.m. UTC | #4
On Tue, 07/14 13:31, Stefan Hajnoczi wrote:
> On Fri, Jul 10, 2015 at 05:42:48AM +0200, Alexandre DERUMIER wrote:
> > >>By the way, why did you choose 10 milliseconds?  That is quite long.
> > >>
> > >>If this function is called once per 10 ms disk I/O operations then we
> > >>lose 50% utilization.  1 ms or less would be reasonable.
> > 
> > From my tests, 1ms is not enough, It still hanging in guest or qmp queries.
> > 10ms give me optimal balance between bitmap scan speed and guest responsiveness.
> 
> Then I don't fully understand the bug.
> 
> Fam: can you explain why 1ms isn't enough?

In Alexandre's case, I suppose it's because the lseek is so slow that sleeping
for 1ms would still let mirror coroutine to occupy, say, 90% of CPU time, so
guest IO stutters. Perhaps we could move lseek to thread pool in the future.

Anyway, 10ms wasn't a deliberate choice, because I didn't have one. I agree in
other cases, 1ms or less should be enough.

Fam
Stefan Hajnoczi July 16, 2015, 1:21 p.m. UTC | #5
On Wed, Jul 15, 2015 at 06:32:54PM +0800, Fam Zheng wrote:
> On Tue, 07/14 13:31, Stefan Hajnoczi wrote:
> > On Fri, Jul 10, 2015 at 05:42:48AM +0200, Alexandre DERUMIER wrote:
> > > >>By the way, why did you choose 10 milliseconds?  That is quite long.
> > > >>
> > > >>If this function is called once per 10 ms disk I/O operations then we
> > > >>lose 50% utilization.  1 ms or less would be reasonable.
> > > 
> > > From my tests, 1ms is not enough, It still hanging in guest or qmp queries.
> > > 10ms give me optimal balance between bitmap scan speed and guest responsiveness.
> > 
> > Then I don't fully understand the bug.
> > 
> > Fam: can you explain why 1ms isn't enough?
> 
> In Alexandre's case, I suppose it's because the lseek is so slow that sleeping
> for 1ms would still let mirror coroutine to occupy, say, 90% of CPU time, so
> guest IO stutters. Perhaps we could move lseek to thread pool in the future.

Right, that's the real problem here.  If lseek is done in a worker
thread than the coroutine yields in the meantime and the responsiveness
problem is solved.

This sounds like an important fix in the early 2.5 release cycle.
diff mbox

Patch

diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 57d8ef1..53ac4f4 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -157,6 +157,22 @@  void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
  */
 void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns);
 
+#define BLOCK_JOB_RELAX_CPU_NS 10000000L
+
+/**
+ * block_job_relax_cpu:
+ * @job: The job that calls the function.
+ *
+ * Sleep a little to avoid intensive cpu time occupation. Block jobs should
+ * call this or block_job_sleep_ns (for more precision, but note that 0 ns is
+ * usually not enought) periodically, otherwise the QMP and VCPU could starve
+ * on CPU and/or BQL.
+ */
+static inline void block_job_relax_cpu(BlockJob *job)
+{
+    return block_job_sleep_ns(job, QEMU_CLOCK_REALTIME, BLOCK_JOB_RELAX_CPU_NS);
+}
+
 /**
  * block_job_yield:
  * @job: The job that calls the function.