Patchwork [3/3] block: convert qemu_aio_flush() calls to bdrv_drain_all()

login
register
mail settings
Submitter Stefan Hajnoczi
Date Nov. 30, 2011, 12:23 p.m.
Message ID <1322655824-31778-4-git-send-email-stefanha@linux.vnet.ibm.com>
Download mbox | patch
Permalink /patch/128494/
State New
Headers show

Comments

Stefan Hajnoczi - Nov. 30, 2011, 12:23 p.m.
Many places in QEMU call qemu_aio_flush() to complete all pending
asynchronous I/O.  Most of these places actually want to drain all block
requests but there is block layer API to do so.

This patch introduces the bdrv_drain_all() API to wait for requests
across all BlockDriverStates to complete.  As a bonus we perform checks
after qemu_aio_wait() to ensure that requests really have finished.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 block-migration.c |    2 +-
 block.c           |   19 +++++++++++++++++++
 block.h           |    1 +
 blockdev.c        |    4 ++--
 cpus.c            |    2 +-
 hw/ide/macio.c    |    5 +++--
 hw/ide/pci.c      |    2 +-
 hw/virtio-blk.c   |    2 +-
 hw/xen_platform.c |    2 +-
 qemu-io.c         |    4 ++--
 savevm.c          |    2 +-
 xen-mapcache.c    |    2 +-
 12 files changed, 34 insertions(+), 13 deletions(-)
Christoph Hellwig - Nov. 30, 2011, 12:28 p.m.
On Wed, Nov 30, 2011 at 12:23:43PM +0000, Stefan Hajnoczi wrote:
> Many places in QEMU call qemu_aio_flush() to complete all pending
> asynchronous I/O.  Most of these places actually want to drain all block
> requests but there is block layer API to do so.

there seems to be a "not" missing in the last half sentence.

> 
> This patch introduces the bdrv_drain_all() API to wait for requests
> across all BlockDriverStates to complete.  As a bonus we perform checks
> after qemu_aio_wait() to ensure that requests really have finished.

It looks like all but four of the callers actually just want to drain a 
single BlockDriverState.  And one of those four already has its own loop
over all BlockDriverStates.
Stefan Hajnoczi - Nov. 30, 2011, 12:54 p.m.
On Wed, Nov 30, 2011 at 12:28 PM, Christoph Hellwig <hch@lst.de> wrote:
> On Wed, Nov 30, 2011 at 12:23:43PM +0000, Stefan Hajnoczi wrote:
>> Many places in QEMU call qemu_aio_flush() to complete all pending
>> asynchronous I/O.  Most of these places actually want to drain all block
>> requests but there is block layer API to do so.
>
> there seems to be a "not" missing in the last half sentence.
>
>>
>> This patch introduces the bdrv_drain_all() API to wait for requests
>> across all BlockDriverStates to complete.  As a bonus we perform checks
>> after qemu_aio_wait() to ensure that requests really have finished.
>
> It looks like all but four of the callers actually just want to drain a
> single BlockDriverState.  And one of those four already has its own loop
> over all BlockDriverStates.

Yes, we still don't have an interface for waiting on just one
BlockDriverState.  virtio-blk even has a comment about the fact that
there is no per-block device way of waiting.

I think this should be done later since it is independent of adding
these asserts after qemu_aio_flush() (which is what this patch does).

Stefan

Patch

diff --git a/block-migration.c b/block-migration.c
index 5f104864..423c5a0 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -387,7 +387,7 @@  static int mig_save_device_dirty(Monitor *mon, QEMUFile *f,
 
     for (sector = bmds->cur_dirty; sector < bmds->total_sectors;) {
         if (bmds_aio_inflight(bmds, sector)) {
-            qemu_aio_flush();
+            bdrv_drain_all();
         }
         if (bdrv_get_dirty(bmds->bs, sector)) {
 
diff --git a/block.c b/block.c
index 37a96f3..86807d6 100644
--- a/block.c
+++ b/block.c
@@ -846,6 +846,25 @@  void bdrv_close_all(void)
     }
 }
 
+/*
+ * Wait for pending requests to complete across all BlockDriverStates
+ *
+ * This function does not flush data to disk, use bdrv_flush_all() for that
+ * after calling this function.
+ */
+void bdrv_drain_all(void)
+{
+    BlockDriverState *bs;
+
+    qemu_aio_flush();
+
+    /* If requests are still pending there is a bug somewhere */
+    QTAILQ_FOREACH(bs, &bdrv_states, list) {
+        assert(QLIST_EMPTY(&bs->tracked_requests));
+        assert(qemu_co_queue_empty(&bs->throttled_reqs));
+    }
+}
+
 /* make a BlockDriverState anonymous by removing from bdrv_state list.
    Also, NULL terminate the device_name to prevent double remove */
 void bdrv_make_anon(BlockDriverState *bs)
diff --git a/block.h b/block.h
index 1d28e85..0dc4905 100644
--- a/block.h
+++ b/block.h
@@ -214,6 +214,7 @@  int bdrv_flush(BlockDriverState *bs);
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs);
 void bdrv_flush_all(void);
 void bdrv_close_all(void);
+void bdrv_drain_all(void);
 
 int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
 int bdrv_co_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
diff --git a/blockdev.c b/blockdev.c
index af4e239..dbf0251 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -653,7 +653,7 @@  int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data)
         goto out;
     }
 
-    qemu_aio_flush();
+    bdrv_drain_all();
     bdrv_flush(bs);
 
     bdrv_close(bs);
@@ -840,7 +840,7 @@  int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
     }
 
     /* quiesce block driver; prevent further io */
-    qemu_aio_flush();
+    bdrv_drain_all();
     bdrv_flush(bs);
     bdrv_close(bs);
 
diff --git a/cpus.c b/cpus.c
index 82530c4..58e173c 100644
--- a/cpus.c
+++ b/cpus.c
@@ -396,7 +396,7 @@  static void do_vm_stop(RunState state)
         pause_all_vcpus();
         runstate_set(state);
         vm_state_notify(0, state);
-        qemu_aio_flush();
+        bdrv_drain_all();
         bdrv_flush_all();
         monitor_protocol_event(QEVENT_STOP, NULL);
     }
diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index 70b3342..c09d2e0 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -200,8 +200,9 @@  static void pmac_ide_flush(DBDMA_io *io)
 {
     MACIOIDEState *m = io->opaque;
 
-    if (m->aiocb)
-        qemu_aio_flush();
+    if (m->aiocb) {
+        bdrv_drain_all();
+    }
 }
 
 /* PowerMac IDE memory IO */
diff --git a/hw/ide/pci.c b/hw/ide/pci.c
index 49b823d..5078c0b 100644
--- a/hw/ide/pci.c
+++ b/hw/ide/pci.c
@@ -309,7 +309,7 @@  void bmdma_cmd_writeb(BMDMAState *bm, uint32_t val)
              * aio operation with preadv/pwritev.
              */
             if (bm->bus->dma->aiocb) {
-                qemu_aio_flush();
+                bdrv_drain_all();
                 assert(bm->bus->dma->aiocb == NULL);
                 assert((bm->status & BM_STATUS_DMAING) == 0);
             }
diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index d6d1f87..4b0d113 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -474,7 +474,7 @@  static void virtio_blk_reset(VirtIODevice *vdev)
      * This should cancel pending requests, but can't do nicely until there
      * are per-device request lists.
      */
-    qemu_aio_flush();
+    bdrv_drain_all();
 }
 
 /* coalesce internal state, copy to pci i/o region 0
diff --git a/hw/xen_platform.c b/hw/xen_platform.c
index 5e792f5..e62eaef 100644
--- a/hw/xen_platform.c
+++ b/hw/xen_platform.c
@@ -120,7 +120,7 @@  static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t v
            devices, and bit 2 the non-primary-master IDE devices. */
         if (val & UNPLUG_ALL_IDE_DISKS) {
             DPRINTF("unplug disks\n");
-            qemu_aio_flush();
+            bdrv_drain_all();
             bdrv_flush_all();
             pci_unplug_disks(s->pci_dev.bus);
         }
diff --git a/qemu-io.c b/qemu-io.c
index de26422..b35adbb 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -1853,9 +1853,9 @@  int main(int argc, char **argv)
     command_loop();
 
     /*
-     * Make sure all outstanding requests get flushed the program exits.
+     * Make sure all outstanding requests complete before the program exits.
      */
-    qemu_aio_flush();
+    bdrv_drain_all();
 
     if (bs) {
         bdrv_delete(bs);
diff --git a/savevm.c b/savevm.c
index f53cd4c..0443554 100644
--- a/savevm.c
+++ b/savevm.c
@@ -2104,7 +2104,7 @@  int load_vmstate(const char *name)
     }
 
     /* Flush all IO requests so they don't interfere with the new state.  */
-    qemu_aio_flush();
+    bdrv_drain_all();
 
     bs = NULL;
     while ((bs = bdrv_next(bs))) {
diff --git a/xen-mapcache.c b/xen-mapcache.c
index 7bcb86e..9fecc64 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -351,7 +351,7 @@  void xen_invalidate_map_cache(void)
     MapCacheRev *reventry;
 
     /* Flush pending AIO before destroying the mapcache */
-    qemu_aio_flush();
+    bdrv_drain_all();
 
     QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) {
         DPRINTF("There should be no locked mappings at this time, "