Message ID | 1465928228-1184-3-git-send-email-stefanha@redhat.com |
---|---|
State | New |
On 14/06/2016 20:17, Stefan Hajnoczi wrote:
> Block jobs are coroutines that usually perform I/O but sometimes also
> sleep or yield. Currently only sleeping or yielded block jobs can be
> paused. This means jobs that do not sleep or yield (using
> block_job_yield()) are unaffected by block_job_pause().
>
> Add block_job_pause_point() so that block jobs can mark quiescent points
> that are suitable for pausing. This solves the problem that it can take
> a block job a long time to pause if it is performing a long series of
> I/O operations.
>
> Transitioning to paused state involves a .pause()/.resume() callback.
> These callbacks are used to ensure that I/O and event loop activity has
> ceased while the job is at a pause point.
>
> Note that this patch introduces a stricter pause state than previously.
> The job->busy flag was incorrectly documented as a quiescent state
> without I/O pending. This is violated by any job that has I/O pending
> across sleep or block_job_yield(), like the mirror block job.

Right, we should document job->busy as a quiescent state where no one
will re-enter the coroutine.

Paolo
On Tue, 06/14 19:17, Stefan Hajnoczi wrote:
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -247,6 +247,30 @@ void block_job_complete(BlockJob *job, Error **errp)
>      job->driver->complete(job, errp);
>  }
>
> +void block_job_pause_point(BlockJob *job)
> +{
> +    if (!block_job_is_paused(job)) {

I find this check ...

> +        return;
> +    }
> +    if (block_job_is_cancelled(job)) {
> +        return;
> +    }
> +
> +    if (job->driver->pause) {
> +        job->driver->pause(job);
> +    }
> +
> +    job->paused = true;

... and this assignment confusing. After reading more, I think we ought to
rename block_job_is_paused to block_job_should_pause and mark it static, in a
separate patch.

> +    job->busy = false;
> +    qemu_coroutine_yield(); /* wait for block_job_resume() */
> +    job->busy = true;
> +    job->paused = false;

Worth adding "assert(!job->pause_count)" (or
"assert(!block_job_should_pause(job))")?

Regardless,

Reviewed-by: Fam Zheng <famz@redhat.com>

> +
> +    if (job->driver->resume) {
> +        job->driver->resume(job);
> +    }
> +}
> +
On 15/06/2016 10:57, Fam Zheng wrote:
> > +    if (!block_job_is_paused(job)) {
>
> I find this check ...
>
> > +        return;
> > +    }
> > +    if (block_job_is_cancelled(job)) {
> > +        return;
> > +    }
> > +
> > +    if (job->driver->pause) {
> > +        job->driver->pause(job);
> > +    }
> > +
> > +    job->paused = true;
>
> ... and this assignment confusing. After reading more, I think we ought to
> rename block_job_is_paused to block_job_should_pause and mark it static, in a
> separate patch.

Very good idea!

Paolo
On Wed, Jun 15, 2016 at 04:57:41PM +0800, Fam Zheng wrote:
> On Tue, 06/14 19:17, Stefan Hajnoczi wrote:
> > --- a/blockjob.c
> > +++ b/blockjob.c
> > @@ -247,6 +247,30 @@ void block_job_complete(BlockJob *job, Error **errp)
> >      job->driver->complete(job, errp);
> >  }
> >
> > +void block_job_pause_point(BlockJob *job)
> > +{
> > +    if (!block_job_is_paused(job)) {
>
> I find this check ...
>
> > +        return;
> > +    }
> > +    if (block_job_is_cancelled(job)) {
> > +        return;
> > +    }
> > +
> > +    if (job->driver->pause) {
> > +        job->driver->pause(job);
> > +    }
> > +
> > +    job->paused = true;
>
> ... and this assignment confusing. After reading more, I think we ought to
> rename block_job_is_paused to block_job_should_pause and mark it static, in a
> separate patch.
>
> > +    job->busy = false;
> > +    qemu_coroutine_yield(); /* wait for block_job_resume() */
> > +    job->busy = true;
> > +    job->paused = false;
>
> Worth adding "assert(!job->pause_count)" (or
> "assert(!block_job_should_pause(job))")?
>
> Regardless,
>
> Reviewed-by: Fam Zheng <famz@redhat.com>

Nice solution! I hesitated a bit with job->paused vs block_job_is_paused()
because the naming is indeed confusing.
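To make the rename and the extra assertion discussed above concrete, here is a minimal standalone sketch in C. It is not QEMU code: ToyJob, toy_job_should_pause(), toy_resume_all() and the yield hook are hypothetical stand-ins invented for illustration, and the hook merely simulates what qemu_coroutine_yield() followed by block_job_resume() would do. The .pause()/.resume() driver callbacks are left out to keep the focus on the naming.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for BlockJob; only the fields this discussion touches. */
typedef struct ToyJob {
    int pause_count;   /* number of outstanding pause requests */
    bool paused;       /* true only while parked at a pause point */
    bool busy;         /* false while yielded and safe to re-enter */
    bool cancelled;
    /* Hypothetical hook standing in for qemu_coroutine_yield(). */
    void (*yield)(struct ToyJob *job);
} ToyJob;

/*
 * Proposed name: answers "has a pause been requested?", which is a
 * different question from "is the job parked?" (job->paused).
 * In blockjob.c this helper would be static, as suggested in the review.
 */
bool toy_job_should_pause(const ToyJob *job)
{
    return job->pause_count > 0;
}

void toy_job_pause_point(ToyJob *job)
{
    if (!toy_job_should_pause(job)) {
        return;
    }
    if (job->cancelled) {
        return;
    }

    job->paused = true;
    job->busy = false;
    job->yield(job);            /* returns once the job has been resumed */
    job->busy = true;
    job->paused = false;

    /* The suggested check: nobody should wake us while a pause is pending. */
    assert(!toy_job_should_pause(job));
}

/* Simulated resumer: sees the parked job and drops every pause request. */
void toy_resume_all(ToyJob *job)
{
    assert(job->paused && !job->busy);
    job->pause_count = 0;
}
```

With the two names separated like this, the early return reads as "no pause requested" and the assignment reads as "now actually parked", which is the distinction the review asks for.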
On Wed, Jun 15, 2016 at 9:53 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 14/06/2016 20:17, Stefan Hajnoczi wrote:
>> Block jobs are coroutines that usually perform I/O but sometimes also
>> sleep or yield. Currently only sleeping or yielded block jobs can be
>> paused. This means jobs that do not sleep or yield (using
>> block_job_yield()) are unaffected by block_job_pause().
>>
>> Add block_job_pause_point() so that block jobs can mark quiescent points
>> that are suitable for pausing. This solves the problem that it can take
>> a block job a long time to pause if it is performing a long series of
>> I/O operations.
>>
>> Transitioning to paused state involves a .pause()/.resume() callback.
>> These callbacks are used to ensure that I/O and event loop activity has
>> ceased while the job is at a pause point.
>>
>> Note that this patch introduces a stricter pause state than previously.
>> The job->busy flag was incorrectly documented as a quiescent state
>> without I/O pending. This is violated by any job that has I/O pending
>> across sleep or block_job_yield(), like the mirror block job.
>
> Right, we should document job->busy as a quiescent state where no one
> will re-enter the coroutine.

That statement doesn't correspond with how it's used:

block_job_sleep_ns() leaves a timer pending and the job will re-enter
when the timer expires. So "no one will re-enter the coroutine" is
too strict.

The important thing is it's safe to call block_job_enter(). In the
block_job_sleep_ns() case the timer is cancelled to prevent double
re-entry.

The doc comment I have in v4 allows the block_job_sleep_ns() case:

    /*
     * Set to false by the job while the coroutine has yielded and may be
     * re-entered by block_job_enter(). There may still be I/O or event loop
     * activity pending.
     */
    bool busy;

Stefan
On 16/06/2016 15:17, Stefan Hajnoczi wrote:
>> Right, we should document job->busy as a quiescent state where no one
>> will re-enter the coroutine.
>
> That statement doesn't correspond with how it's used:
>
> block_job_sleep_ns() leaves a timer pending and the job will re-enter
> when the timer expires. So "no one will re-enter the coroutine" is
> too strict.

And of course you're right. :) What I (sloppily) meant was "where the
block job code will not re-enter the coroutine", which is what makes it
safe to call block_job_enter().

> The important thing is it's safe to call block_job_enter(). In the
> block_job_sleep_ns() case the timer is cancelled to prevent double
> re-entry.
>
> The doc comment I have in v4 allows the block_job_sleep_ns() case:
>
>     /*
>      * Set to false by the job while the coroutine has yielded and may be
>      * re-entered by block_job_enter(). There may still be I/O or event loop
>      * activity pending.
>      */
>     bool busy;

Sounds good!

Paolo
diff --git a/blockjob.c b/blockjob.c
index 463bccf..1a383d1 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -247,6 +247,30 @@ void block_job_complete(BlockJob *job, Error **errp)
     job->driver->complete(job, errp);
 }
 
+void block_job_pause_point(BlockJob *job)
+{
+    if (!block_job_is_paused(job)) {
+        return;
+    }
+    if (block_job_is_cancelled(job)) {
+        return;
+    }
+
+    if (job->driver->pause) {
+        job->driver->pause(job);
+    }
+
+    job->paused = true;
+    job->busy = false;
+    qemu_coroutine_yield(); /* wait for block_job_resume() */
+    job->busy = true;
+    job->paused = false;
+
+    if (job->driver->resume) {
+        job->driver->resume(job);
+    }
+}
+
 void block_job_pause(BlockJob *job)
 {
     job->pause_count++;
@@ -360,13 +384,13 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
         return;
     }
 
-    job->busy = false;
-    if (block_job_is_paused(job)) {
-        qemu_coroutine_yield();
-    } else {
+    if (!block_job_is_paused(job)) {
+        job->busy = false;
         co_aio_sleep_ns(blk_get_aio_context(job->blk), type, ns);
+        job->busy = true;
     }
-    job->busy = true;
+
+    block_job_pause_point(job);
 }
 
 void block_job_yield(BlockJob *job)
@@ -378,9 +402,13 @@ void block_job_yield(BlockJob *job)
         return;
     }
 
-    job->busy = false;
-    qemu_coroutine_yield();
-    job->busy = true;
+    if (!block_job_is_paused(job)) {
+        job->busy = false;
+        qemu_coroutine_yield();
+        job->busy = true;
+    }
+
+    block_job_pause_point(job);
 }
 
 BlockJobInfo *block_job_query(BlockJob *job)
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 00ac418..154c48b 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -70,6 +70,20 @@ typedef struct BlockJobDriver {
      * never both.
      */
     void (*abort)(BlockJob *job);
+
+    /**
+     * If the callback is not NULL, it will be invoked when the job transitions
+     * into the paused state. Paused jobs must not perform any asynchronous
+     * I/O or event loop activity. This callback is used to quiesce jobs.
+     */
+    void (*pause)(BlockJob *job);
+
+    /**
+     * If the callback is not NULL, it will be invoked when the job transitions
+     * out of the paused state. Any asynchronous I/O or event loop activity
+     * should be restarted from this callback.
+     */
+    void (*resume)(BlockJob *job);
 } BlockJobDriver;
 
 /**
@@ -119,13 +133,19 @@ struct BlockJob {
     bool user_paused;
 
     /**
-     * Set to false by the job while it is in a quiescent state, where
-     * no I/O is pending and the job has yielded on any condition
-     * that is not detected by #aio_poll, such as a timer.
+     * Set to false by the job while the coroutine has yielded and may be
+     * re-entered by block_job_enter(). There may still be I/O or event loop
+     * activity pending.
      */
     bool busy;
 
     /**
+     * Set to true by the job while it is in a quiescent state, where
+     * no I/O or event loop activity is pending.
+     */
+    bool paused;
+
+    /**
      * Set to true when the job is ready to be completed.
      */
     bool ready;
@@ -299,6 +319,15 @@ bool block_job_is_cancelled(BlockJob *job);
 BlockJobInfo *block_job_query(BlockJob *job);
 
 /**
+ * block_job_pause_point:
+ * @job: The job that is ready to pause.
+ *
+ * Pause now if block_job_pause() has been called. Block jobs that perform
+ * lots of I/O must call this between requests so that the job can be paused.
+ */
+void coroutine_fn block_job_pause_point(BlockJob *job);
+
+/**
  * block_job_pause:
  * @job: The job to be paused.
  *
Block jobs are coroutines that usually perform I/O but sometimes also
sleep or yield. Currently only sleeping or yielded block jobs can be
paused. This means jobs that do not sleep or yield (using
block_job_yield()) are unaffected by block_job_pause().

Add block_job_pause_point() so that block jobs can mark quiescent points
that are suitable for pausing. This solves the problem that it can take
a block job a long time to pause if it is performing a long series of
I/O operations.

Transitioning to paused state involves a .pause()/.resume() callback.
These callbacks are used to ensure that I/O and event loop activity has
ceased while the job is at a pause point.

Note that this patch introduces a stricter pause state than previously.
The job->busy flag was incorrectly documented as a quiescent state
without I/O pending. This is violated by any job that has I/O pending
across sleep or block_job_yield(), like the mirror block job.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 blockjob.c               | 44 ++++++++++++++++++++++++++++++++++++--------
 include/block/blockjob.h | 35 ++++++++++++++++++++++++++++++++---
 2 files changed, 68 insertions(+), 11 deletions(-)
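As a rough illustration of how a job that issues many requests might use a pause point together with a .pause() callback, the following standalone C model may help. Everything in it is an assumption made for the example: ToyJob, ToyJobDriver, toy_drain(), toy_job_run(), the in_flight counter and the pause_at parameter do not exist in QEMU, and the yield hook only imitates qemu_coroutine_yield() being followed by a resume. It is a sketch of the idea in the commit message, not the implementation in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyJob ToyJob;

/* Hypothetical miniature of BlockJobDriver: only the two new callbacks. */
typedef struct ToyJobDriver {
    void (*pause)(ToyJob *job);   /* quiesce: stop I/O and event loop activity */
    void (*resume)(ToyJob *job);  /* restart whatever pause() stopped */
} ToyJobDriver;

struct ToyJob {
    const ToyJobDriver *driver;
    int pause_count;         /* outstanding pause requests */
    bool paused;             /* parked at a pause point, fully quiescent */
    bool busy;               /* false while yielded and safe to re-enter */
    int in_flight;           /* invented: requests issued but not completed */
    int completed;
    int pauses_taken;        /* bookkeeping for the example only */
    int in_flight_at_pause;  /* what the resumer observed while parked */
    void (*yield)(ToyJob *job);  /* stand-in for qemu_coroutine_yield() */
};

/* Example .pause() callback: wait for (here: complete) all pending requests. */
void toy_drain(ToyJob *job)
{
    job->completed += job->in_flight;
    job->in_flight = 0;
}

/* Simulated "yield, then get resumed": records what it saw while parked. */
void toy_yield_and_resume(ToyJob *job)
{
    assert(job->paused && !job->busy);
    job->in_flight_at_pause = job->in_flight;
    job->pauses_taken++;
    job->pause_count--;
}

void toy_job_pause_point(ToyJob *job)
{
    if (job->pause_count == 0) {
        return;                      /* nobody asked us to pause */
    }
    if (job->driver->pause) {
        job->driver->pause(job);
    }
    job->paused = true;
    job->busy = false;
    job->yield(job);
    job->busy = true;
    job->paused = false;
    if (job->driver->resume) {
        job->driver->resume(job);
    }
}

/* Main loop: a pause point between requests keeps pause latency bounded. */
void toy_job_run(ToyJob *job, int n_requests, int pause_at)
{
    job->busy = true;
    for (int i = 0; i < n_requests; i++) {
        if (i == pause_at) {
            job->pause_count++;      /* simulated external pause request */
        }
        job->in_flight++;            /* issue one request without waiting */
        toy_job_pause_point(job);
    }
    toy_drain(job);                  /* finish whatever is still pending */
}
```

In this model the .pause() callback is what makes the parked state stricter than "busy is false": the job never sleeps or yields on its own, yet a pause request is honoured after at most one more request, and by the time the job is parked nothing is left in flight.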