Message ID: 1358359893-718-1-git-send-email-kwolf@redhat.com
State: New
On 16/01/2013 19:11, Kevin Wolf wrote:
> aio_poll() must return true if any work is still pending, even if it
> didn't make progress, so that qemu_aio_wait() doesn't return too early.
> The possibility of returning early occasionally led to a failed
> assertion in bdrv_drain_all(), when some in-flight request was missed
> and the function didn't really drain all requests.
>
> In order to make that change, the return value as specified in the
> function comment must change for blocking = false; fortunately, the
> return value of blocking = false callers is only used in test cases, so
> this change shouldn't cause any trouble.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Looks good, but please make the same change in aio-win32.c, and add a Cc
to qemu-stable@nongnu.org.

Paolo

> ---
>  aio-posix.c         | 3 ++-
>  include/block/aio.h | 6 ++----
>  tests/test-aio.c    | 4 ++--
>  3 files changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/aio-posix.c b/aio-posix.c
> index 88d09e1..fe4dbb4 100644
> --- a/aio-posix.c
> +++ b/aio-posix.c
> @@ -264,5 +264,6 @@ bool aio_poll(AioContext *ctx, bool blocking)
>          }
>      }
>
> -    return progress;
> +    assert(progress || busy);
> +    return true;
>  }
> diff --git a/include/block/aio.h b/include/block/aio.h
> index 0933f05..8eda924 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -173,16 +173,14 @@ bool aio_pending(AioContext *ctx);
>   * aio as a result of executing I/O completion or bh callbacks.
>   *
>   * If there is no pending AIO operation or completion (bottom half),
> - * return false. If there are pending bottom halves, return true.
> + * return false. If there are pending AIO operations of bottom halves,
> + * return true.
>   *
>   * If there are no pending bottom halves, but there are pending AIO
>   * operations, it may not be possible to make any progress without
>   * blocking. If @blocking is true, this function will wait until one
>   * or more AIO events have completed, to ensure something has moved
>   * before returning.
> - *
> - * If @blocking is false, this function will also return false if the
> - * function cannot make any progress without blocking.
>   */
>  bool aio_poll(AioContext *ctx, bool blocking);
>
> diff --git a/tests/test-aio.c b/tests/test-aio.c
> index e4ebef7..c173870 100644
> --- a/tests/test-aio.c
> +++ b/tests/test-aio.c
> @@ -315,13 +315,13 @@ static void test_wait_event_notifier_noflush(void)
>      event_notifier_set(&data.e);
>      g_assert(aio_poll(ctx, false));
>      g_assert_cmpint(data.n, ==, 1);
> -    g_assert(!aio_poll(ctx, false));
> +    g_assert(aio_poll(ctx, false));
>      g_assert_cmpint(data.n, ==, 1);
>
>      event_notifier_set(&data.e);
>      g_assert(aio_poll(ctx, false));
>      g_assert_cmpint(data.n, ==, 2);
> -    g_assert(!aio_poll(ctx, false));
> +    g_assert(aio_poll(ctx, false));
>      g_assert_cmpint(data.n, ==, 2);
>
>      event_notifier_set(&dummy.e);
> aio_poll() must return true if any work is still pending, even if it didn't
> make progress, so that qemu_aio_wait() doesn't return too early.
> The possibility of returning early occasionally led to a failed assertion in
> bdrv_drain_all(), when some in-flight request was missed and the function
> didn't really drain all requests.

I still have a problem with bdrv_drain_all() and my backup block job.
If I reset/suspend/resume the VM during a backup job run I get:

block.c:1221: bdrv_drain_all: Assertion `((&bs->tracked_requests)->lh_first == ((void *)0))' failed.
Aborted

I am not 100% sure, but I think a simple qemu_aio_wait() is not enough to
ensure that a copy-on-write action has finished.

Any idea how to solve that problem?
On 22/01/2013 08:02, Dietmar Maurer wrote:
>> aio_poll() must return true if any work is still pending, even if it didn't
>> make progress, so that qemu_aio_wait() doesn't return too early.
>> The possibility of returning early occasionally led to a failed assertion in
>> bdrv_drain_all(), when some in-flight request was missed and the function
>> didn't really drain all requests.
>
> I still have a problem with bdrv_drain_all() and my backup block job.
> If I reset/suspend/resume the VM during a backup job run I get:
>
> block.c:1221: bdrv_drain_all: Assertion `((&bs->tracked_requests)->lh_first == ((void *)0))' failed.
> Aborted
>
> I am not 100% sure, but I think a simple qemu_aio_wait() is not enough to
> ensure that a copy-on-write action has finished.

Are you using timers in any way?

Paolo

> Any idea how to solve that problem?
>> If I reset/suspend/resume the VM during a backup job run I get:
>>
>> block.c:1221: bdrv_drain_all: Assertion `((&bs->tracked_requests)->lh_first == ((void *)0))' failed.
>> Aborted
>>
>> I am not 100% sure, but I think a simple qemu_aio_wait() is not enough
>> to ensure that a copy-on-write action has finished.
>
> Are you using timers in any way?

Yes, I call co_sleep_ns(rt_clock, delay) to limit the rate to the output stream.
>>> If I reset/suspend/resume the VM during a backup job run I get:
>>>
>>> block.c:1221: bdrv_drain_all: Assertion `((&bs->tracked_requests)->lh_first == ((void *)0))' failed.
>>> Aborted
>>>
>>> I am not 100% sure, but I think a simple qemu_aio_wait() is not enough
>>> to ensure that a copy-on-write action has finished.
>>
>> Are you using timers in any way?
>
> Yes, I call co_sleep_ns(rt_clock, delay) to limit the rate to the output
> stream.

Use block_job_sleep_ns instead, and only call it when no I/O is pending.

Paolo
>>> Are you using timers in any way?
>>
>> Yes, I call co_sleep_ns(rt_clock, delay) to limit the rate to the output
>> stream.
>
> Use block_job_sleep_ns instead, and only call it when no I/O is pending.

Thanks, that works!
>>>> Are you using timers in any way?
>>>
>>> Yes, I call co_sleep_ns(rt_clock, delay) to limit the rate to the output
>>> stream.
>>
>> Use block_job_sleep_ns instead, and only call it when no I/O is pending.
>
> Thanks, that works!

I currently use qemu_aio_set_fd_handler() to implement async output to
the backup stream. While that works, performance is much better (during
backup) if I use a separate thread to write the data.

Is that known behavior, or should aio give about the same performance as
using a thread?

Or would I get better performance if I used Linux native AIO support?
On 23/01/2013 09:58, Dietmar Maurer wrote:
>>>>> Are you using timers in any way?
>>>>
>>>> Yes, I call co_sleep_ns(rt_clock, delay) to limit the rate to the output
>>>> stream.
>>>
>>> Use block_job_sleep_ns instead, and only call it when no I/O is pending.
>>
>> Thanks, that works!
>
> I currently use qemu_aio_set_fd_handler() to implement async output to
> the backup stream. While that works, performance is much better (during
> backup) if I use a separate thread to write the data.
>
> Is that known behavior, or should aio give about the same performance as
> using a thread?

It depends, but this is what we saw for migration too. You may also have
less contention on the big QEMU lock if you use a thread.

> Or would I get better performance if I used Linux native AIO support?

That also depends, but in general it should be faster.

Paolo
>> I currently use qemu_aio_set_fd_handler() to implement async output to
>> the backup stream. While that works, performance is much better
>> (during backup) if I use a separate thread to write the data.
>>
>> Is that known behavior, or should aio give about the same performance
>> as using a thread?
>
> It depends, but this is what we saw for migration too. You may also have
> less contention on the big QEMU lock if you use a thread.

But when I use a thread it triggers the bug in bdrv_drain_all(). So how
can I fix bdrv_drain_all() if I use a separate thread to write data?

I currently use a CoQueue to wait for the output thread. Maybe there is
some example code somewhere?

>> Or would I get better performance if I used Linux native AIO support?
>
> That also depends, but in general it should be faster.

Is there an easy way to switch to native AIO? The code in
block/linux-aio.c is only usable inside block drivers? I simply need to
write to a Unix file descriptor.
On 23/01/2013 12:37, Dietmar Maurer wrote:
>>> I currently use qemu_aio_set_fd_handler() to implement async output to
>>> the backup stream. While that works, performance is much better
>>> (during backup) if I use a separate thread to write the data.
>>>
>>> Is that known behavior, or should aio give about the same performance
>>> as using a thread?
>>
>> It depends, but this is what we saw for migration too. You may also have
>> less contention on the big QEMU lock if you use a thread.
>
> But when I use a thread it triggers the bug in bdrv_drain_all(). So how
> can I fix bdrv_drain_all() if I use a separate thread to write data?

The bug is, in all likelihood, in your own code. Sorry. :)

> I currently use a CoQueue to wait for the output thread.

The simplest way to synchronize the main event loop with another thread
is a bottom half. There is no example yet, but I will post one soon for
migration. In the other direction, use an EventNotifier (for which
bottom halves are just a nice wrapper).

> Maybe there is some example code somewhere?
>
>>> Or would I get better performance if I used Linux native AIO support?
>>
>> That also depends, but in general it should be faster.
>
> Is there an easy way to switch to native AIO? The code in
> block/linux-aio.c is only usable inside block drivers? I simply need to
> write to a Unix file descriptor.

First of all, AIO is only usable for files and block devices, not
character devices or sockets. If you have one, you can use the block
device API rather than the file descriptor API to write it. You'll get
native AIO support for free.

Support for running the block drivers in a separate thread is not there
yet, but I predict it will come later this year.

Paolo
>>>> Is that known behavior, or should aio give about the same
>>>> performance as using a thread?
>>>
>>> It depends, but this is what we saw for migration too. You may also
>>> have less contention on the big QEMU lock if you use a thread.

I managed to fix the performance issue by doing the following:

- use O_DIRECT to avoid the host cache on writes
- limit the write size to at most 4096 bytes (throughput gets a bit
  lower, but the VM is much more responsive during backup)

Now I get about the same behavior/performance as with a thread.
Unfortunately, the bug in bdrv_drain_all() triggers more often.

>> But when I use a thread it triggers the bug in bdrv_drain_all(). So
>> how can I fix bdrv_drain_all() if I use a separate thread to write data?
>
> The bug is, in all likelihood, in your own code. Sorry. :)
>
>> I currently use a CoQueue to wait for the output thread.
>
> The simplest way to synchronize the main event loop with another thread
> is a bottom half. There is no example yet, but I will post one soon for
> migration.

The problem is not really how to synchronize. The problem is that I need
to halt the block job when the output buffer is full, and this causes the
bug in bdrv_drain_all().

I am currently a bit out of ideas - will do more testing.
>> But when I use a thread it triggers the bug in bdrv_drain_all(). So
>> how can I fix bdrv_drain_all() if I use a separate thread to write data?
>
> The bug is, in all likelihood, in your own code. Sorry. :)

Yes. I still don't fully understand that aio code.

If I detect an incomplete write, I register an aio handler like this:

    ret = write(fd, buf, count);
    if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        ...
        qemu_aio_set_fd_handler(fd, NULL, vma_co_continue_write, NULL, mydata);
        qemu_coroutine_yield();

But it seems that qemu_aio_wait() returns false after that.

I thought that it should return true instead?
On 24/01/2013 08:42, Dietmar Maurer wrote:
>>> But when I use a thread it triggers the bug in bdrv_drain_all(). So
>>> how can I fix bdrv_drain_all() if I use a separate thread to write data?
>>
>> The bug is, in all likelihood, in your own code. Sorry. :)
>
> Yes. I still don't fully understand that aio code.
>
> If I detect an incomplete write, I register an aio handler like this:
>
>     ret = write(fd, buf, count);
>     if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
>         ...
>         qemu_aio_set_fd_handler(fd, NULL, vma_co_continue_write, NULL, mydata);
>         qemu_coroutine_yield();
>
> But it seems that qemu_aio_wait() returns false after that.
>
> I thought that it should return true instead?

You need a flush handler (the fourth argument to qemu_aio_set_fd_handler)
that returns true.

Paolo
>> If I detect an incomplete write, I register an aio handler like this:
>>
>>     ret = write(fd, buf, count);
>>     if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
>>         ...
>>         qemu_aio_set_fd_handler(fd, NULL, vma_co_continue_write, NULL, mydata);
>>         qemu_coroutine_yield();
>>
>> But it seems that qemu_aio_wait() returns false after that.
>>
>> I thought that it should return true instead?
>
> You need a flush handler (the fourth argument to qemu_aio_set_fd_handler)
> that returns true.

Ok, that helps :-) Many thanks!
diff --git a/aio-posix.c b/aio-posix.c
index 88d09e1..fe4dbb4 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -264,5 +264,6 @@ bool aio_poll(AioContext *ctx, bool blocking)
         }
     }
 
-    return progress;
+    assert(progress || busy);
+    return true;
 }
diff --git a/include/block/aio.h b/include/block/aio.h
index 0933f05..8eda924 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -173,16 +173,14 @@ bool aio_pending(AioContext *ctx);
  * aio as a result of executing I/O completion or bh callbacks.
  *
  * If there is no pending AIO operation or completion (bottom half),
- * return false. If there are pending bottom halves, return true.
+ * return false. If there are pending AIO operations of bottom halves,
+ * return true.
  *
  * If there are no pending bottom halves, but there are pending AIO
  * operations, it may not be possible to make any progress without
  * blocking. If @blocking is true, this function will wait until one
  * or more AIO events have completed, to ensure something has moved
  * before returning.
- *
- * If @blocking is false, this function will also return false if the
- * function cannot make any progress without blocking.
  */
 bool aio_poll(AioContext *ctx, bool blocking);
 
diff --git a/tests/test-aio.c b/tests/test-aio.c
index e4ebef7..c173870 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -315,13 +315,13 @@ static void test_wait_event_notifier_noflush(void)
     event_notifier_set(&data.e);
     g_assert(aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 1);
-    g_assert(!aio_poll(ctx, false));
+    g_assert(aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 1);
 
     event_notifier_set(&data.e);
     g_assert(aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 2);
-    g_assert(!aio_poll(ctx, false));
+    g_assert(aio_poll(ctx, false));
     g_assert_cmpint(data.n, ==, 2);
 
     event_notifier_set(&dummy.e);
aio_poll() must return true if any work is still pending, even if it
didn't make progress, so that qemu_aio_wait() doesn't return too early.
The possibility of returning early occasionally led to a failed
assertion in bdrv_drain_all(), when some in-flight request was missed
and the function didn't really drain all requests.

In order to make that change, the return value as specified in the
function comment must change for blocking = false; fortunately, the
return value of blocking = false callers is only used in test cases, so
this change shouldn't cause any trouble.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 aio-posix.c         | 3 ++-
 include/block/aio.h | 6 ++----
 tests/test-aio.c    | 4 ++--
 3 files changed, 6 insertions(+), 7 deletions(-)