diff mbox

[v4] Do not delete BlockDriverState when deleting the drive

Message ID 20110330015147.GG5445@us.ibm.com
State New
Headers show

Commit Message

Ryan Harper March 30, 2011, 1:51 a.m. UTC
When removing a drive from the host-side via drive_del we currently have
the following path:

drive_del
qemu_aio_flush()
bdrv_close()    // zaps bs->drv, which makes any subsequent I/O get
                // dropped.  Works as designed
drive_uninit()
bdrv_delete()   // frees the bs.  Since the device is still connected to
                // bs, any subsequent I/O is a use-after-free.

The value of bs->drv becomes unpredictable on free.  As long as it
remains null, I/O still gets dropped, however it could become non-null
at any point after the free resulting SEGVs or other QEMU state
corruption.

To resolve this issue as simply as possible, we can chose to not
actually delete the BlockDriverState pointer.  Since bdrv_close()
handles setting the drv pointer to NULL, we just need to remove the
BlockDriverState from the QLIST that is used to enumerate the block
devices.  This is currently handled within bdrv_delete, so move this
into its own function, bdrv_make_anon().

The result is that we can now invoke drive_del, this closes the file
descriptors and sets BlockDriverState->drv to NULL which prevents futher
IO to the device, and since we do not free BlockDriverState, we don't
have to worry about the copy retained in the block devices.

We also don't attempt to remove the qdev property since we are no longer
deleting the BlockDriverState on drives with associated drives.  This
also allows for removing Drives with no devices associated either.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
---
v3->v4
 - add back missing nulling of device_name in v2
 - use drive_uninit() when removing drive with no peer
 - align commit message on 72 char boundary
v2->v3
 - Update drive_del use after free description 
 - s/bdrv_remove/bdrv_make_anon/g
 - Don't remove qdev property since we don't delete bs any more
 - If (bs->peer) bdrv_make_anon else bdrv_delete to handle removing
   drives with no device.
v1->v2
  - NULL bs->device_name after removing from list to prevent
    second removal.

 block.c    |   14 +++++++++++---
 block.h    |    1 +
 blockdev.c |   25 ++++++++-----------------
 3 files changed, 20 insertions(+), 20 deletions(-)

Comments

Markus Armbruster March 30, 2011, 7:36 a.m. UTC | #1
[Cc: Justin, because I feel it should go into stable-0.14 as well]

Ryan Harper <ryanh@us.ibm.com> writes:

> When removing a drive from the host-side via drive_del we currently have
> the following path:
>
> drive_del
> qemu_aio_flush()
> bdrv_close()    // zaps bs->drv, which makes any subsequent I/O get
>                 // dropped.  Works as designed
> drive_uninit()
> bdrv_delete()   // frees the bs.  Since the device is still connected to
>                 // bs, any subsequent I/O is a use-after-free.
>
> The value of bs->drv becomes unpredictable on free.  As long as it
> remains null, I/O still gets dropped, however it could become non-null
> at any point after the free resulting SEGVs or other QEMU state
> corruption.
>
> To resolve this issue as simply as possible, we can chose to not
> actually delete the BlockDriverState pointer.  Since bdrv_close()
> handles setting the drv pointer to NULL, we just need to remove the
> BlockDriverState from the QLIST that is used to enumerate the block
> devices.  This is currently handled within bdrv_delete, so move this
> into its own function, bdrv_make_anon().
>
> The result is that we can now invoke drive_del, this closes the file
> descriptors and sets BlockDriverState->drv to NULL which prevents futher
> IO to the device, and since we do not free BlockDriverState, we don't
> have to worry about the copy retained in the block devices.
>
> We also don't attempt to remove the qdev property since we are no longer
> deleting the BlockDriverState on drives with associated drives.  This
> also allows for removing Drives with no devices associated either.
>
> Reported-by: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: Ryan Harper <ryanh@us.ibm.com>

Acked-by: Markus Armbruster <armbru@redhat.com>
Kevin Wolf March 31, 2011, 8:03 a.m. UTC | #2
Am 30.03.2011 03:51, schrieb Ryan Harper:
> When removing a drive from the host-side via drive_del we currently have
> the following path:
> 
> drive_del
> qemu_aio_flush()
> bdrv_close()    // zaps bs->drv, which makes any subsequent I/O get
>                 // dropped.  Works as designed
> drive_uninit()
> bdrv_delete()   // frees the bs.  Since the device is still connected to
>                 // bs, any subsequent I/O is a use-after-free.
> 
> The value of bs->drv becomes unpredictable on free.  As long as it
> remains null, I/O still gets dropped, however it could become non-null
> at any point after the free resulting SEGVs or other QEMU state
> corruption.
> 
> To resolve this issue as simply as possible, we can chose to not
> actually delete the BlockDriverState pointer.  Since bdrv_close()
> handles setting the drv pointer to NULL, we just need to remove the
> BlockDriverState from the QLIST that is used to enumerate the block
> devices.  This is currently handled within bdrv_delete, so move this
> into its own function, bdrv_make_anon().
> 
> The result is that we can now invoke drive_del, this closes the file
> descriptors and sets BlockDriverState->drv to NULL which prevents futher
> IO to the device, and since we do not free BlockDriverState, we don't
> have to worry about the copy retained in the block devices.
> 
> We also don't attempt to remove the qdev property since we are no longer
> deleting the BlockDriverState on drives with associated drives.  This
> also allows for removing Drives with no devices associated either.
> 
> Reported-by: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: Ryan Harper <ryanh@us.ibm.com>

Thanks, applied to the block branch.

Kevin
diff mbox

Patch

diff --git a/block.c b/block.c
index c8e2f97..c93ec6d 100644
--- a/block.c
+++ b/block.c
@@ -697,14 +697,22 @@  void bdrv_close_all(void)
     }
 }
 
+/* make a BlockDriverState anonymous by removing from bdrv_state list.
+   Also, NULL terminate the device_name to prevent double remove */
+void bdrv_make_anon(BlockDriverState *bs)
+{
+    if (bs->device_name[0] != '\0') {
+        QTAILQ_REMOVE(&bdrv_states, bs, list);
+    }
+    bs->device_name[0] = '\0';
+}
+
 void bdrv_delete(BlockDriverState *bs)
 {
     assert(!bs->peer);
 
     /* remove from list, if necessary */
-    if (bs->device_name[0] != '\0') {
-        QTAILQ_REMOVE(&bdrv_states, bs, list);
-    }
+    bdrv_make_anon(bs);
 
     bdrv_close(bs);
     if (bs->file != NULL) {
diff --git a/block.h b/block.h
index 5d78fc0..52e9cad 100644
--- a/block.h
+++ b/block.h
@@ -66,6 +66,7 @@  int bdrv_create(BlockDriver *drv, const char* filename,
     QEMUOptionParameter *options);
 int bdrv_create_file(const char* filename, QEMUOptionParameter *options);
 BlockDriverState *bdrv_new(const char *device_name);
+void bdrv_make_anon(BlockDriverState *bs);
 void bdrv_delete(BlockDriverState *bs);
 int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags);
 int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
diff --git a/blockdev.c b/blockdev.c
index ecf2252..c810b16 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -737,8 +737,6 @@  int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
 {
     const char *id = qdict_get_str(qdict, "id");
     BlockDriverState *bs;
-    BlockDriverState **ptr;
-    Property *prop;
 
     bs = bdrv_find(id);
     if (!bs) {
@@ -755,24 +753,17 @@  int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
     bdrv_flush(bs);
     bdrv_close(bs);
 
-    /* clean up guest state from pointing to host resource by
-     * finding and removing DeviceState "drive" property */
+    /* if we have a device associated with this BlockDriverState (bs->peer)
+     * then we need to make the drive anonymous until the device
+     * can be removed.  If this is a drive with no device backing
+     * then we can just get rid of the block driver state right here.
+     */
     if (bs->peer) {
-        for (prop = bs->peer->info->props; prop && prop->name; prop++) {
-            if (prop->info->type == PROP_TYPE_DRIVE) {
-                ptr = qdev_get_prop_ptr(bs->peer, prop);
-                if (*ptr == bs) {
-                    bdrv_detach(bs, bs->peer);
-                    *ptr = NULL;
-                    break;
-                }
-            }
-        }
+        bdrv_make_anon(bs);
+    } else {
+        drive_uninit(drive_get_by_blockdev(bs));
     }
 
-    /* clean up host side */
-    drive_uninit(drive_get_by_blockdev(bs));
-
     return 0;
 }