Message ID: 1428069921-2957-3-git-send-email-famz@redhat.com
State: New
On Fri, Apr 03, 2015 at 10:05:19PM +0800, Fam Zheng wrote:

> +    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> +        AioContext *aio_context = bdrv_get_aio_context(bs);
> +
> +        aio_context_acquire(aio_context);
> +        if (bs->job) {
> +            block_job_pause(bs->job);
> +        }
> +        aio_context_release(aio_context);
> +    }

Not directly a problem in your code, but since I'm playing with the idea of
allowing block jobs to reside in any arbitrary node, we'll have to figure
out a simple and global way to iterate over all block jobs. The one that I
wrote is probably not the best one. Maybe we can have
block_job_create/completed() maintain a list of jobs instead.

https://lists.gnu.org/archive/html/qemu-devel/2015-03/msg04800.html

But anyway I don't think it's something for this patchset, therefore:

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto
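The idea of having block_job_create/completed() maintain a global job list could be sketched as below. This is a standalone toy model, not QEMU code: BlockJob is reduced to a minimal struct, and the hand-rolled singly linked list stands in for the QLIST_* macros from qemu/queue.h that a real implementation would use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: the create/complete paths keep every job on a
 * global list, so callers can iterate over all jobs without walking
 * bdrv_states node by node. */
typedef struct BlockJob BlockJob;
struct BlockJob {
    char name[32];
    BlockJob *next;              /* stand-in for QLIST_ENTRY(BlockJob) */
};

static BlockJob *all_block_jobs; /* stand-in for a QLIST_HEAD */

static BlockJob *block_job_create(const char *name)
{
    BlockJob *job = calloc(1, sizeof(*job));
    strncpy(job->name, name, sizeof(job->name) - 1);
    job->next = all_block_jobs;  /* QLIST_INSERT_HEAD equivalent */
    all_block_jobs = job;
    return job;
}

static void block_job_completed(BlockJob *job)
{
    BlockJob **p;
    for (p = &all_block_jobs; *p; p = &(*p)->next) {
        if (*p == job) {         /* QLIST_REMOVE equivalent */
            *p = job->next;
            free(job);
            return;
        }
    }
}

static int block_job_count(void)
{
    int n = 0;
    for (BlockJob *job = all_block_jobs; job; job = job->next) {
        n++;                     /* "iterate over all block jobs" */
    }
    return n;
}
```

With such a list, the pause/resume loops in bdrv_drain_all() could iterate over jobs directly instead of probing bs->job on every BlockDriverState.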
On Fri, Apr 03, 2015 at 10:05:19PM +0800, Fam Zheng wrote:
> This is necessary to suppress more IO requests from being generated from
> block job coroutines.
>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/block.c b/block.c
> index f2f8ae7..00cd91e 100644
> --- a/block.c
> +++ b/block.c
> @@ -2033,6 +2033,16 @@ void bdrv_drain_all(void)
>      bool busy = true;
>      BlockDriverState *bs;
>
> +    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> +        AioContext *aio_context = bdrv_get_aio_context(bs);
> +
> +        aio_context_acquire(aio_context);
> +        if (bs->job) {
> +            block_job_pause(bs->job);
> +        }
> +        aio_context_release(aio_context);
> +    }
> +
>      while (busy) {
>          busy = false;
>
> @@ -2044,6 +2054,16 @@ void bdrv_drain_all(void)
>              aio_context_release(aio_context);
>          }
>      }
> +
> +    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> +        AioContext *aio_context = bdrv_get_aio_context(bs);
> +
> +        aio_context_acquire(aio_context);
> +        if (bs->job) {
> +            block_job_resume(bs->job);
> +        }
> +        aio_context_release(aio_context);
> +    }
>  }

There is a tiny chance that we pause a job (which actually just sets
job->paused = true, but there's no guarantee the job's coroutine reacts to
this) right before it terminates. Then aio_poll() enters the coroutine one
last time and the job terminates.

We then reach the resume portion of bdrv_drain_all() and the job no longer
exists. Hopefully nothing started a new job in the meantime; bs->job should
either be the original block job or NULL.

This code seems safe under current assumptions, but I just wanted to raise
these issues in case someone sees problems that I've missed.

Stefan
On Wed, Apr 08, 2015 at 11:37:52AM +0100, Stefan Hajnoczi wrote:
> > +    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> > +        AioContext *aio_context = bdrv_get_aio_context(bs);
> > +
> > +        aio_context_acquire(aio_context);
> > +        if (bs->job) {
> > +            block_job_pause(bs->job);
> > +        }
> > +        aio_context_release(aio_context);
> > +    }
> > +
> >      while (busy) {
> >          busy = false;
> >
> > @@ -2044,6 +2054,16 @@ void bdrv_drain_all(void)
> >              aio_context_release(aio_context);
> >          }
> >      }
> > +
> > +    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> > +        AioContext *aio_context = bdrv_get_aio_context(bs);
> > +
> > +        aio_context_acquire(aio_context);
> > +        if (bs->job) {
> > +            block_job_resume(bs->job);
> > +        }
> > +        aio_context_release(aio_context);
> > +    }
> >  }
>
> There is a tiny chance that we pause a job (which actually just sets
> job->paused = true, but there's no guarantee the job's coroutine reacts
> to this) right before it terminates. Then aio_poll() enters the
> coroutine one last time and the job terminates.
>
> We then reach the resume portion of bdrv_drain_all() and the job no
> longer exists. Hopefully nothing started a new job in the meantime;
> bs->job should either be the original block job or NULL.
>
> This code seems safe under current assumptions, but I just wanted to
> raise these issues in case someone sees problems that I've missed.

Is it possible that a new job is started in the meantime? If that's the
case this will hit the assertion in block_job_resume().

Berto
On Wed, Apr 08, 2015 at 04:56:14PM +0200, Alberto Garcia wrote:
> On Wed, Apr 08, 2015 at 11:37:52AM +0100, Stefan Hajnoczi wrote:
> > [...]
> > There is a tiny chance that we pause a job (which actually just sets
> > job->paused = true, but there's no guarantee the job's coroutine
> > reacts to this) right before it terminates. Then aio_poll() enters
> > the coroutine one last time and the job terminates.
> >
> > We then reach the resume portion of bdrv_drain_all() and the job no
> > longer exists. Hopefully nothing started a new job in the meantime;
> > bs->job should either be the original block job or NULL.
> >
> > This code seems safe under current assumptions, but I just wanted to
> > raise these issues in case someone sees problems that I've missed.
>
> Is it possible that a new job is started in the meantime? If that's
> the case this will hit the assertion in block_job_resume().

That is currently not possible, since the QEMU monitor does not run while
we're waiting in aio_poll(), so no block job monitor commands can spawn a
new job.

If code is added that spawns a job based on an AioContext timer or some
other event, then this assumption no longer holds and there is a problem:
block_job_resume() would be called on a job that was never paused. But for
now there is no problem.

Stefan
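The pause/resume contract the thread is worried about can be modeled in a few lines. This is a standalone illustration, not blockjob.c: the names mirror QEMU's, but only the parts relevant to the discussion are kept, namely that pausing merely sets a flag and that resuming asserts the job was actually paused, which is why resuming a job created between the pause and resume loops would abort.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the block job pause/resume contract. */
typedef struct {
    bool paused;
} BlockJob;

static void block_job_pause(BlockJob *job)
{
    job->paused = true;   /* only sets the flag; the job's coroutine may
                           * not have reached a pause point yet */
}

static void block_job_resume(BlockJob *job)
{
    assert(job->paused);  /* a brand-new, never-paused job trips this */
    job->paused = false;
}
```

A job spawned by a timer while bdrv_drain_all() waits in aio_poll() would reach block_job_resume() with paused still false, hitting the assertion exactly as Berto describes.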
diff --git a/block.c b/block.c
index f2f8ae7..00cd91e 100644
--- a/block.c
+++ b/block.c
@@ -2033,6 +2033,16 @@ void bdrv_drain_all(void)
     bool busy = true;
     BlockDriverState *bs;
 
+    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
+        AioContext *aio_context = bdrv_get_aio_context(bs);
+
+        aio_context_acquire(aio_context);
+        if (bs->job) {
+            block_job_pause(bs->job);
+        }
+        aio_context_release(aio_context);
+    }
+
     while (busy) {
         busy = false;
 
@@ -2044,6 +2054,16 @@ void bdrv_drain_all(void)
             aio_context_release(aio_context);
         }
     }
+
+    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
+        AioContext *aio_context = bdrv_get_aio_context(bs);
+
+        aio_context_acquire(aio_context);
+        if (bs->job) {
+            block_job_resume(bs->job);
+        }
+        aio_context_release(aio_context);
+    }
 }
 
 /* make a BlockDriverState anonymous by removing from bdrv_state and
This is necessary to suppress more IO requests from being generated from
block job coroutines.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
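The overall shape of the patch, pause every job, poll until no requests remain in flight, then resume, can be sketched as a standalone model. The AioContext acquire/release pairs and real request tracking are stubbed out; aio_poll_stub and the inflight counter are inventions of this sketch, not QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Standalone model of the bdrv_drain_all() structure added by the patch. */
typedef struct { bool paused; } BlockJob;
typedef struct {
    BlockJob *job;        /* NULL if no job is attached to this node */
    int inflight;         /* pending requests, drained toward zero */
} BlockDriverState;

static bool aio_poll_stub(BlockDriverState *bs)
{
    if (bs->inflight > 0) {
        bs->inflight--;   /* pretend one request completed */
        return true;      /* progress was made, so stay busy */
    }
    return false;
}

static void bdrv_drain_all_model(BlockDriverState *states, size_t n)
{
    bool busy = true;

    for (size_t i = 0; i < n; i++) {          /* pause loop */
        if (states[i].job) {
            states[i].job->paused = true;
        }
    }
    while (busy) {                            /* drain loop */
        busy = false;
        for (size_t i = 0; i < n; i++) {
            busy |= aio_poll_stub(&states[i]);
        }
    }
    for (size_t i = 0; i < n; i++) {          /* resume loop */
        if (states[i].job) {
            assert(states[i].job->paused);
            states[i].job->paused = false;
        }
    }
}
```

The bracketing matters: without the pause loop, a running job's coroutine could keep submitting requests and the drain loop might never settle.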