diff mbox

[07/11] block: let backup blockjob run in BDS AioContext

Message ID 1412182919-9550-8-git-send-email-stefanha@redhat.com
State New
Headers show

Commit Message

Stefan Hajnoczi Oct. 1, 2014, 5:01 p.m. UTC
The backup block job must run in the BlockDriverState AioContext so that
it works with dataplane.

The basics of acquiring the AioContext are easy in blockdev.c.

The completion code in block/backup.c must call bdrv_unref() from the
main loop.  Use block_job_defer_to_main_loop() to achieve that.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/backup.c | 21 +++++++++++++++++++--
 blockdev.c     | 23 ++++++++++++++++-------
 2 files changed, 35 insertions(+), 9 deletions(-)

Comments

Max Reitz Oct. 1, 2014, 7:38 p.m. UTC | #1
On 01.10.2014 19:01, Stefan Hajnoczi wrote:
> The backup block job must run in the BlockDriverState AioContext so that
> it works with dataplane.
>
> The basics of acquiring the AioContext are easy in blockdev.c.
>
> The completion code in block/backup.c must call bdrv_unref() from the
> main loop.  Use block_job_defer_to_main_loop() to achieve that.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>   block/backup.c | 21 +++++++++++++++++++--
>   blockdev.c     | 23 ++++++++++++++++-------
>   2 files changed, 35 insertions(+), 9 deletions(-)

Hm, so, if I run a backup blockjob from one thread (which is not the 
main loop and not the BDS's AIO context) and try to cancel it or set its 
speed from another thread (e.g. the main loop), won't the blockjob hold 
the BDS's AIO context lock as long as it runs and making it impossible 
to interfere?

Max
Stefan Hajnoczi Oct. 2, 2014, 3:08 p.m. UTC | #2
On Wed, Oct 01, 2014 at 09:38:39PM +0200, Max Reitz wrote:
> On 01.10.2014 19:01, Stefan Hajnoczi wrote:
> >The backup block job must run in the BlockDriverState AioContext so that
> >it works with dataplane.
> >
> >The basics of acquiring the AioContext are easy in blockdev.c.
> >
> >The completion code in block/backup.c must call bdrv_unref() from the
> >main loop.  Use block_job_defer_to_main_loop() to achieve that.
> >
> >Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> >---
> >  block/backup.c | 21 +++++++++++++++++++--
> >  blockdev.c     | 23 ++++++++++++++++-------
> >  2 files changed, 35 insertions(+), 9 deletions(-)
> 
> Hm, so, if I run a backup blockjob from one thread (which is not the main
> loop and not the BDS's AIO context)

That cannot happen.  The blockjob always runs with the BDS AioContext
acquired - either because we explicitly did so in the main loop or
because the dataplane IOThread acquires it.

> and try to cancel it or set its speed
> from another thread (e.g. the main loop), won't the blockjob hold the BDS's
> AIO context lock as long as it runs and making it impossible to interfere?

No, because of how the AioContext's RfifoLock lock works.

It has a "contention callback" that kicks the aio_poll() loop.  This
allows the QEMU monitor to acquire the lock whenever an IOThread is
blocked in poll(2).  It uses an eventfd to force the IOThread out of
poll(2) and this happens automatically in aio_context_acquire().

See aio_context_new() and aio_rfifolock_cb().

Stefan
Max Reitz Oct. 4, 2014, 7:54 p.m. UTC | #3
On 02.10.2014 17:08, Stefan Hajnoczi wrote:
> On Wed, Oct 01, 2014 at 09:38:39PM +0200, Max Reitz wrote:
>> On 01.10.2014 19:01, Stefan Hajnoczi wrote:
>>> The backup block job must run in the BlockDriverState AioContext so that
>>> it works with dataplane.
>>>
>>> The basics of acquiring the AioContext are easy in blockdev.c.
>>>
>>> The completion code in block/backup.c must call bdrv_unref() from the
>>> main loop.  Use block_job_defer_to_main_loop() to achieve that.
>>>
>>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>>> ---
>>>   block/backup.c | 21 +++++++++++++++++++--
>>>   blockdev.c     | 23 ++++++++++++++++-------
>>>   2 files changed, 35 insertions(+), 9 deletions(-)
>> Hm, so, if I run a backup blockjob from one thread (which is not the main
>> loop and not the BDS's AIO context)
> That cannot happen.  The blockjob always runs with the BDS AioContext
> acquired - either because we explicitly did so in the main loop or
> because the dataplane IOThread acquires it.
>
>> and try to cancel it or set its speed
>> from another thread (e.g. the main loop), won't the blockjob hold the BDS's
>> AIO context lock as long as it runs and making it impossible to interfere?
> No, because of how the AioContext's RfifoLock lock works.
>
> It has a "contention callback" that kicks the aio_poll() loop.  This
> allows the QEMU monitor to acquire the lock whenever an IOThread is
> blocked in poll(2).  It uses an eventfd to force the IOThread out of
> poll(2) and this happens automatically in aio_context_acquire().
>
> See aio_context_new() and aio_rfifolock_cb().

Ah, okay, thank you for clarifying that.

Reviewed-by: Max Reitz <mreitz@redhat.com>
diff mbox

Patch

diff --git a/block/backup.c b/block/backup.c
index d0b0225..9d015b5 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -227,9 +227,25 @@  static BlockErrorAction backup_error_action(BackupBlockJob *job,
     }
 }
 
+typedef struct {
+    int ret;
+} BackupCompleteData;
+
+static void backup_complete(BlockJob *job, void *opaque)
+{
+    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+    BackupCompleteData *data = opaque;
+
+    bdrv_unref(s->target);
+
+    block_job_completed(job, data->ret);
+    g_free(data);
+}
+
 static void coroutine_fn backup_run(void *opaque)
 {
     BackupBlockJob *job = opaque;
+    BackupCompleteData *data;
     BlockDriverState *bs = job->common.bs;
     BlockDriverState *target = job->target;
     BlockdevOnError on_target_error = job->on_target_error;
@@ -344,9 +360,10 @@  static void coroutine_fn backup_run(void *opaque)
     hbitmap_free(job->bitmap);
 
     bdrv_iostatus_disable(target);
-    bdrv_unref(target);
 
-    block_job_completed(&job->common, ret);
+    data = g_malloc(sizeof(*data));
+    data->ret = ret;
+    block_job_defer_to_main_loop(&job->common, backup_complete, data);
 }
 
 void backup_start(BlockDriverState *bs, BlockDriverState *target,
diff --git a/blockdev.c b/blockdev.c
index 1c79352..44f0cc7 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2068,6 +2068,7 @@  void qmp_drive_backup(const char *device, const char *target,
     BlockDriverState *bs;
     BlockDriverState *target_bs;
     BlockDriverState *source = NULL;
+    AioContext *aio_context;
     BlockDriver *drv = NULL;
     Error *local_err = NULL;
     int flags;
@@ -2093,9 +2094,12 @@  void qmp_drive_backup(const char *device, const char *target,
         return;
     }
 
+    aio_context = bdrv_get_aio_context(bs);
+    aio_context_acquire(aio_context);
+
     if (!bdrv_is_inserted(bs)) {
         error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
-        return;
+        goto out;
     }
 
     if (!has_format) {
@@ -2105,12 +2109,12 @@  void qmp_drive_backup(const char *device, const char *target,
         drv = bdrv_find_format(format);
         if (!drv) {
             error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
-            return;
+            goto out;
         }
     }
 
     if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
-        return;
+        goto out;
     }
 
     flags = bs->open_flags | BDRV_O_RDWR;
@@ -2130,7 +2134,7 @@  void qmp_drive_backup(const char *device, const char *target,
     size = bdrv_getlength(bs);
     if (size < 0) {
         error_setg_errno(errp, -size, "bdrv_getlength failed");
-        return;
+        goto out;
     }
 
     if (mode != NEW_IMAGE_MODE_EXISTING) {
@@ -2147,23 +2151,28 @@  void qmp_drive_backup(const char *device, const char *target,
 
     if (local_err) {
         error_propagate(errp, local_err);
-        return;
+        goto out;
     }
 
     target_bs = NULL;
     ret = bdrv_open(&target_bs, target, NULL, NULL, flags, drv, &local_err);
     if (ret < 0) {
         error_propagate(errp, local_err);
-        return;
+        goto out;
     }
 
+    bdrv_set_aio_context(target_bs, aio_context);
+
     backup_start(bs, target_bs, speed, sync, on_source_error, on_target_error,
                  block_job_cb, bs, &local_err);
     if (local_err != NULL) {
         bdrv_unref(target_bs);
         error_propagate(errp, local_err);
-        return;
+        goto out;
     }
+
+out:
+    aio_context_release(aio_context);
 }
 
 BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)