[v2,09/13] block: Drain requests before swapping nodes in bdrv_swap()

Message ID 1433944027-28533-10-git-send-email-kwolf@redhat.com
State New

Commit Message

Kevin Wolf June 10, 2015, 1:47 p.m. UTC
bdrv_swap() requires that there are no requests in flight on either of
the two devices. The request coroutine would work on the wrong
BlockDriverState object (with bs->opaque even being interpreted as a
different type potentially) and all sorts of bad things would result
from this.

The currently existing callers mostly ensure that there is no I/O
pending on nodes that are swapped. In detail, this is:

1. Live snapshots. This goes through qmp_transaction(), which calls
   bdrv_drain_all() before doing anything. The command is executed
   synchronously, so no new I/O can be issued concurrently.

2. snapshot=on in bdrv_open(). We're in the middle of opening the image
   (both the original image and its temporary overlay), so there can't
   be any I/O in flight yet.

3. Mirroring. bdrv_drain() is already used on the source device so that
   the mirror doesn't miss anything. However, the main loop runs between
   that and the bdrv_swap() (which is actually a bug, being addressed in
   another series), so there is a small window in which new I/O might be
   issued that would be in flight during bdrv_swap().

It is safer to just drain the request queue of both devices in
bdrv_swap() instead of relying on callers to do the right thing.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Eric Blake June 10, 2015, 11:17 p.m. UTC | #1
On 06/10/2015 07:47 AM, Kevin Wolf wrote:
> bdrv_swap() requires that there are no requests in flight on either of
> the two devices. The request coroutine would work on the wrong
> BlockDriverState object (with bs->opaque even being interpreted as a
> different type potentially) and all sorts of bad things would result
> from this.
> 
> The currently existing callers mostly ensure that there is no I/O
> pending on nodes that are swapped. In detail, this is:
> 
> 1. Live snapshots. This goes through qmp_transaction(), which calls
>    bdrv_drain_all() before doing anything. The command is executed
>    synchronously, so no new I/O can be issued concurrently.
> 
> 2. snapshot=on in bdrv_open(). We're in the middle of opening the image
>    (both the original image and its temporary overlay), so there can't
>    be any I/O in flight yet.
> 
> 3. Mirroring. bdrv_drain() is already used on the source device so that
>    the mirror doesn't miss anything. However, the main loop runs between
>    that and the bdrv_swap() (which is actually a bug, being addressed in
>    another series), so there is a small window in which new I/O might be
>    issued that would be in flight during bdrv_swap().
> 
> It is safer to just drain the request queue of both devices in
> bdrv_swap() instead of relying on callers to do the right thing.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

Max Reitz June 12, 2015, 2:16 p.m. UTC | #2
On 10.06.2015 15:47, Kevin Wolf wrote:
> bdrv_swap() requires that there are no requests in flight on either of
> the two devices. The request coroutine would work on the wrong
> BlockDriverState object (with bs->opaque even being interpreted as a
> different type potentially) and all sorts of bad things would result
> from this.
>
> The currently existing callers mostly ensure that there is no I/O
> pending on nodes that are swapped. In detail, this is:
>
> 1. Live snapshots. This goes through qmp_transaction(), which calls
>     bdrv_drain_all() before doing anything. The command is executed
>     synchronously, so no new I/O can be issued concurrently.
>
> 2. snapshot=on in bdrv_open(). We're in the middle of opening the image
>     (both the original image and its temporary overlay), so there can't
>     be any I/O in flight yet.
>
> 3. Mirroring. bdrv_drain() is already used on the source device so that
>     the mirror doesn't miss anything. However, the main loop runs between
>     that and the bdrv_swap() (which is actually a bug, being addressed in
>     another series), so there is a small window in which new I/O might be
>     issued that would be in flight during bdrv_swap().
>
> It is safer to just drain the request queue of both devices in
> bdrv_swap() instead of relying on callers to do the right thing.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>   block.c | 6 ++++++
>   1 file changed, 6 insertions(+)

Reviewed-by: Max Reitz <mreitz@redhat.com>

Patch

diff --git a/block.c b/block.c
index 7ba0edd..f1ceb26 100644
--- a/block.c
+++ b/block.c
@@ -1947,6 +1947,9 @@  void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
 {
     BlockDriverState tmp;
 
+    bdrv_drain(bs_new);
+    bdrv_drain(bs_old);
+
     /* The code needs to swap the node_name but simply swapping node_list won't
      * work so first remove the nodes from the graph list, do the swap then
      * insert them back if needed.
@@ -1990,6 +1993,9 @@  void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
         QTAILQ_INSERT_TAIL(&graph_bdrv_states, bs_old, node_list);
     }
 
+    assert(QLIST_EMPTY(&bs_old->tracked_requests));
+    assert(QLIST_EMPTY(&bs_new->tracked_requests));
+
     bdrv_rebind(bs_new);
     bdrv_rebind(bs_old);
 }
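
For readers without the QEMU tree at hand, the drain-then-swap pattern of this patch can be sketched as a small standalone toy model. This is only an illustration under stated assumptions: ToyNode, toy_drain(), toy_swap() and toy_demo() are hypothetical names invented here and are not QEMU APIs, and the request queue is reduced to a simple counter instead of a list of tracked requests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of a block node; not a QEMU structure. */
typedef struct ToyNode {
    const char *driver; /* stands in for driver-specific state (bs->opaque) */
    int in_flight;      /* requests issued but not yet completed */
    int completed;      /* requests that have finished */
} ToyNode;

/* Complete every pending request on the node, as a drain would. */
static void toy_drain(ToyNode *n)
{
    while (n->in_flight > 0) {
        n->in_flight--;
        n->completed++;
    }
}

/*
 * Swap the contents of two nodes in place. Both nodes are drained first,
 * so no request can start on one object and finish on what has become
 * the other one.
 */
static void toy_swap(ToyNode *a, ToyNode *b)
{
    ToyNode tmp;

    toy_drain(a);
    toy_drain(b);

    tmp = *a;
    *a = *b;
    *b = tmp;

    /* Mirrors the assertions added by the patch: nothing is in flight. */
    assert(a->in_flight == 0);
    assert(b->in_flight == 0);
}

/* Small usage example: returns the number of requests left in flight. */
static int toy_demo(void)
{
    ToyNode node_a = { "qcow2", 0, 0 };
    ToyNode node_b = { "raw", 3, 0 };

    toy_swap(&node_a, &node_b);
    return node_a.in_flight + node_b.in_flight;
}
```

Putting the drain inside toy_swap() instead of in each caller mirrors the reasoning in the commit message: a caller that drains early and then lets other code run before swapping, as in the mirroring case, is still covered.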