Message ID | 1433944027-28533-10-git-send-email-kwolf@redhat.com |
---|---|
State | New |
On 06/10/2015 07:47 AM, Kevin Wolf wrote:
> bdrv_swap() requires that there are no requests in flight on either of
> the two devices. The request coroutine would work on the wrong
> BlockDriverState object (with bs->opaque even being interpreted as a
> different type potentially) and all sorts of bad things would result
> from this.
>
> The currently existing callers mostly ensure that there is no I/O
> pending on nodes that are swapped. In detail, this is:
>
> 1. Live snapshots. This goes through qmp_transaction(), which calls
>    bdrv_drain_all() before doing anything. The command is executed
>    synchronously, so no new I/O can be issued concurrently.
>
> 2. snapshot=on in bdrv_open(). We're in the middle of opening the image
>    (both the original image and its temporary overlay), so there can't
>    be any I/O in flight yet.
>
> 3. Mirroring. bdrv_drain() is already used on the source device so that
>    the mirror doesn't miss anything. However, the main loop runs between
>    that and the bdrv_swap() (which is actually a bug, being addressed in
>    another series), so there is a small window in which new I/O might be
>    issued that would be in flight during bdrv_swap().
>
> It is safer to just drain the request queue of both devices in
> bdrv_swap() instead of relying on callers to do the right thing.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>

Reviewed-by: Eric Blake <eblake@redhat.com>
On 10.06.2015 15:47, Kevin Wolf wrote:
> bdrv_swap() requires that there are no requests in flight on either of
> the two devices. The request coroutine would work on the wrong
> BlockDriverState object (with bs->opaque even being interpreted as a
> different type potentially) and all sorts of bad things would result
> from this.
>
> The currently existing callers mostly ensure that there is no I/O
> pending on nodes that are swapped. In detail, this is:
>
> 1. Live snapshots. This goes through qmp_transaction(), which calls
>    bdrv_drain_all() before doing anything. The command is executed
>    synchronously, so no new I/O can be issued concurrently.
>
> 2. snapshot=on in bdrv_open(). We're in the middle of opening the image
>    (both the original image and its temporary overlay), so there can't
>    be any I/O in flight yet.
>
> 3. Mirroring. bdrv_drain() is already used on the source device so that
>    the mirror doesn't miss anything. However, the main loop runs between
>    that and the bdrv_swap() (which is actually a bug, being addressed in
>    another series), so there is a small window in which new I/O might be
>    issued that would be in flight during bdrv_swap().
>
> It is safer to just drain the request queue of both devices in
> bdrv_swap() instead of relying on callers to do the right thing.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c | 6 ++++++
>  1 file changed, 6 insertions(+)

Reviewed-by: Max Reitz <mreitz@redhat.com>
diff --git a/block.c b/block.c
index 7ba0edd..f1ceb26 100644
--- a/block.c
+++ b/block.c
@@ -1947,6 +1947,9 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
 {
     BlockDriverState tmp;
 
+    bdrv_drain(bs_new);
+    bdrv_drain(bs_old);
+
     /* The code needs to swap the node_name but simply swapping node_list won't
      * work so first remove the nodes from the graph list, do the swap then
      * insert them back if needed.
@@ -1990,6 +1993,9 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
         QTAILQ_INSERT_TAIL(&graph_bdrv_states, bs_old, node_list);
     }
 
+    assert(QLIST_EMPTY(&bs_old->tracked_requests));
+    assert(QLIST_EMPTY(&bs_new->tracked_requests));
+
     bdrv_rebind(bs_new);
     bdrv_rebind(bs_old);
 }
bdrv_swap() requires that there are no requests in flight on either of
the two devices. The request coroutine would work on the wrong
BlockDriverState object (with bs->opaque even being interpreted as a
different type potentially) and all sorts of bad things would result
from this.

The currently existing callers mostly ensure that there is no I/O
pending on nodes that are swapped. In detail, this is:

1. Live snapshots. This goes through qmp_transaction(), which calls
   bdrv_drain_all() before doing anything. The command is executed
   synchronously, so no new I/O can be issued concurrently.

2. snapshot=on in bdrv_open(). We're in the middle of opening the image
   (both the original image and its temporary overlay), so there can't
   be any I/O in flight yet.

3. Mirroring. bdrv_drain() is already used on the source device so that
   the mirror doesn't miss anything. However, the main loop runs between
   that and the bdrv_swap() (which is actually a bug, being addressed in
   another series), so there is a small window in which new I/O might be
   issued that would be in flight during bdrv_swap().

It is safer to just drain the request queue of both devices in
bdrv_swap() instead of relying on callers to do the right thing.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c | 6 ++++++
 1 file changed, 6 insertions(+)
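
For readers unfamiliar with the pattern, the following is a minimal standalone sketch of "drain both nodes, then swap, then assert that nothing is in flight". It is a toy model and not QEMU code: ToyNode, toy_submit(), toy_drain() and toy_swap() are hypothetical names invented for illustration, and the in_flight counter only stands in for the tracked_requests list that the real assertions check.

```c
#include <assert.h>

/* Toy stand-in for a block node. This is NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    const char *name;
    int in_flight;   /* models the tracked_requests list as a plain counter */
    int completed;   /* number of requests that have finished on this node */
} ToyNode;

/* Hypothetical helper: start one request on a node. */
static void toy_submit(ToyNode *n)
{
    n->in_flight++;
}

/* Models the role of bdrv_drain(): complete requests until none are left. */
static void toy_drain(ToyNode *n)
{
    while (n->in_flight > 0) {
        n->in_flight--;
        n->completed++;
    }
}

/*
 * Models the shape of the patched bdrv_swap(): drain both nodes first,
 * exchange their contents, then assert that no request is in flight.
 */
static void toy_swap(ToyNode *a, ToyNode *b)
{
    ToyNode tmp;

    toy_drain(a);
    toy_drain(b);

    tmp = *a;
    *a = *b;
    *b = tmp;

    assert(a->in_flight == 0);
    assert(b->in_flight == 0);
}
```

In this model, calling toy_swap() while one node still has submitted requests completes them against the original contents before the exchange, so no request ever observes the swapped state. Without the two toy_drain() calls the final assertions would fail, which mirrors the situation the patch guards against.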