Patchwork [2/2] Fix Block Hotplug race with drive_unplug()

Submitter Ryan Harper
Date Oct. 18, 2010, 10:17 p.m.
Message ID <1287440237-14675-3-git-send-email-ryanh@us.ibm.com>
Permalink /patch/68245/
State New

Comments

Ryan Harper - Oct. 18, 2010, 10:17 p.m.
Block hot unplug is racy since the guest is required to acknowledge the ACPI
unplug event; this may not happen synchronously with the device removal command.

This series aims to close a gap whereby management applications that assume the
block resource has been removed, without confirming that the guest has
acknowledged the removal, may re-assign the underlying device to a second guest,
leading to data leakage.

This series introduces a new monitor command to decouple asynchronous device
removal from restricting guest access to a block device.  We do this by creating
a new monitor command drive_unplug which maps to a bdrv_unplug() function which
does a bdrv_flush() and bdrv_close().  Once complete, subsequent IO to the
device is rejected and the guest will get IO errors but continue to function.

A subsequent device removal command can be issued to remove the device, to which
the guest may or may not respond, but as long as the unplugged bit is set, no IO
will be submitted.

Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
---
 block.c         |    6 ++++++
 block.h         |    1 +
 blockdev.c      |   26 ++++++++++++++++++++++++++
 blockdev.h      |    1 +
 hmp-commands.hx |   15 +++++++++++++++
 5 files changed, 49 insertions(+), 0 deletions(-)
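
For illustration only, the intended behaviour described in the commit message
(flush and close on unplug, then reject all further IO while the guest keeps
running) can be modelled in a few lines of standalone C.  This is a toy sketch,
not QEMU code: ToyDrive, toy_drive_unplug() and toy_drive_write() are
hypothetical names, and returning -EIO is an assumption standing in for
whatever error the real block layer reports on a closed device.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block device; not QEMU's BlockDriverState. */
typedef struct ToyDrive {
    bool open;     /* false once the drive has been unplugged */
    int  flushes;  /* number of flushes performed so far */
} ToyDrive;

/* Model of drive_unplug: flush outstanding data, then close the image. */
static void toy_drive_unplug(ToyDrive *d)
{
    assert(d != NULL);
    if (!d->open) {
        return;            /* already unplugged, nothing left to do */
    }
    d->flushes++;          /* stands in for bdrv_flush() */
    d->open = false;       /* stands in for bdrv_close() */
}

/* Model of a guest write: succeeds only while the drive is still open. */
static int toy_drive_write(const ToyDrive *d)
{
    assert(d != NULL);
    return d->open ? 0 : -EIO;   /* -EIO is an assumed error code */
}
```

In this model a write returns 0 before toy_drive_unplug() and -EIO afterwards,
and a second unplug is a no-op, which is the property a management application
would rely on before handing the backing device to another guest.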
Stefan Hajnoczi - Oct. 19, 2010, 10:45 a.m.
On Mon, Oct 18, 2010 at 11:17 PM, Ryan Harper <ryanh@us.ibm.com> wrote:
> diff --git a/block.c b/block.c
> index a19374d..9fedb27 100644
> --- a/block.c
> +++ b/block.c
> @@ -1328,6 +1328,12 @@ void bdrv_set_removable(BlockDriverState *bs, int removable)
>     }
>  }
>
> +void bdrv_unplug(BlockDriverState *bs)
> +{
> +    bdrv_flush(bs);
> +    bdrv_close(bs);

bdrv_flush() does not wait for pending aio requests to complete.
bdrv_close() does not wait either.

A VM with a qcow2 image file and pending aio requests could call
bdrv_unplug() and free the qcow2 state before the aio completions occur.
If a completion is handled after bdrv_close(), the qcow2 in-memory
state has already been freed and we get memory corruption or a crash.

I think the solution is to call qemu_aio_flush() before bdrv_flush().
It waits until all pending aio requests have completed.
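
To make the ordering problem concrete, here is a small standalone model of it.
It is only a sketch under assumptions: ToyImage and the toy_*() helpers are
hypothetical and do not exist in QEMU, format_state stands in for the qcow2
in-memory state, and toy_drain() plays the role that qemu_aio_flush() would
play in the real code.  A "late" completion is counted instead of actually
touching freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of an image with in-flight asynchronous requests. */
typedef struct ToyImage {
    int *format_state;  /* stands in for qcow2 state; NULL once closed */
    int  pending;       /* requests submitted but not yet completed */
    int  completed;     /* completions that ran while state was valid */
    int  late;          /* completions that ran after close (the bug) */
} ToyImage;

static void toy_open(ToyImage *img, int pending)
{
    img->format_state = calloc(1, sizeof(*img->format_state));
    assert(img->format_state != NULL);
    img->pending = pending;
    img->completed = 0;
    img->late = 0;
}

/* One completion callback: it needs the format state to do its work. */
static void toy_complete_one(ToyImage *img)
{
    assert(img->pending > 0);
    img->pending--;
    if (img->format_state != NULL) {
        (*img->format_state)++;
        img->completed++;
    } else {
        img->late++;    /* real code would use freed memory here */
    }
}

/* Wait for every pending request, as qemu_aio_flush() is meant to. */
static void toy_drain(ToyImage *img)
{
    while (img->pending > 0) {
        toy_complete_one(img);
    }
}

/* Free the format state, as closing the image does. */
static void toy_close(ToyImage *img)
{
    free(img->format_state);
    img->format_state = NULL;
}

/* Unplug with or without draining first, to compare both orderings. */
static void toy_unplug(ToyImage *img, bool drain_first)
{
    if (drain_first) {
        toy_drain(img);
    }
    toy_close(img);
}
```

With drain_first set, all completions run against valid state and none are
late.  Without it, every completion delivered after toy_unplug() lands on a
closed image, which corresponds to the corruption or crash described above.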

Stefan

Patch

diff --git a/block.c b/block.c
index a19374d..9fedb27 100644
--- a/block.c
+++ b/block.c
@@ -1328,6 +1328,12 @@  void bdrv_set_removable(BlockDriverState *bs, int removable)
     }
 }
 
+void bdrv_unplug(BlockDriverState *bs)
+{
+    bdrv_flush(bs);
+    bdrv_close(bs);
+}
+
 int bdrv_is_removable(BlockDriverState *bs)
 {
     return bs->removable;
diff --git a/block.h b/block.h
index 5f64380..732f63e 100644
--- a/block.h
+++ b/block.h
@@ -171,6 +171,7 @@  void bdrv_set_on_error(BlockDriverState *bs, BlockErrorAction on_read_error,
                        BlockErrorAction on_write_error);
 BlockErrorAction bdrv_get_on_error(BlockDriverState *bs, int is_read);
 void bdrv_set_removable(BlockDriverState *bs, int removable);
+void bdrv_unplug(BlockDriverState *bs);
 int bdrv_is_removable(BlockDriverState *bs);
 int bdrv_is_read_only(BlockDriverState *bs);
 int bdrv_is_sg(BlockDriverState *bs);
diff --git a/blockdev.c b/blockdev.c
index a00b3fa..da0b256 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -609,3 +609,29 @@  int do_change_block(Monitor *mon, const char *device,
     }
     return monitor_read_bdrv_key_start(mon, bs, NULL, NULL);
 }
+
+int do_drive_unplug(Monitor *mon, const QDict *qdict, QObject **ret_data)
+{
+    DriveInfo *dinfo;
+    BlockDriverState *bs;
+    const char *id;
+
+    if (!qdict_haskey(qdict, "id")) {
+        qerror_report(QERR_MISSING_PARAMETER, "id");
+        return -1;
+    }
+
+    id = qdict_get_str(qdict, "id");
+    dinfo = drive_get_by_id(id);
+    if (!dinfo) {
+        qerror_report(QERR_DEVICE_NOT_FOUND, id);
+        return -1;
+    }
+
+    /* mark block device unplugged */
+    bs = dinfo->bdrv;
+    bdrv_unplug(bs);
+
+    return 0;
+}
+ 
diff --git a/blockdev.h b/blockdev.h
index 19c6915..ecb9ac8 100644
--- a/blockdev.h
+++ b/blockdev.h
@@ -52,5 +52,6 @@  int do_eject(Monitor *mon, const QDict *qdict, QObject **ret_data);
 int do_block_set_passwd(Monitor *mon, const QDict *qdict, QObject **ret_data);
 int do_change_block(Monitor *mon, const char *device,
                     const char *filename, const char *fmt);
+int do_drive_unplug(Monitor *mon, const QDict *qdict, QObject **ret_data);
 
 #endif
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 81999aa..7a32a2e 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -68,6 +68,21 @@  Eject a removable medium (use -f to force it).
 ETEXI
 
     {
+        .name       = "drive_unplug",
+        .args_type  = "id:s",
+        .params     = "device",
+        .help       = "unplug block device",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_drive_unplug,
+    },
+
+STEXI
+@item drive_unplug @var{device}
+@findex drive_unplug
+Unplug block device.
+ETEXI
+
+    {
         .name       = "change",
         .args_type  = "device:B,target:F,arg:s?",
         .params     = "device filename [format]",