Patchwork [RFC,08/13] block: drop bdrv_get_aio_context()

Submitter Stefan Hajnoczi
Date June 14, 2013, 9:48 a.m.
Message ID <1371203313-26490-9-git-send-email-stefanha@redhat.com>
Permalink /patch/251310/
State New

Comments

Stefan Hajnoczi - June 14, 2013, 9:48 a.m.
Associating a BlockDriverState with a single AioContext is not flexible
enough.  Once we make BlockDriverState thread-safe, it will be possible
to call bdrv_*() functions from multiple event loops.

Use the thread-local AioContext pointer instead of
bdrv_get_aio_context().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block.c                   | 6 ------
 block/raw-posix.c         | 4 ++--
 block/raw-win32.c         | 2 +-
 include/block/block_int.h | 7 -------
 4 files changed, 3 insertions(+), 16 deletions(-)
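For reference, the thread-local accessor the patch switches to appears in the diff only as `*get_thread_aio_context()`; the sketch below is a guess at its shape (a function returning the address of a per-thread AioContext slot), not the actual implementation from the series:

```c
#include <assert.h>

/* Stand-in for QEMU's AioContext (include/block/aio.h); opaque here. */
typedef struct AioContext AioContext;

/* Each event loop thread keeps a pointer to the AioContext it runs. */
static __thread AioContext *thread_aio_context;

/* Guessed sketch: return the address of this thread's AioContext slot,
 * so callers dereference it as *get_thread_aio_context(), matching the
 * usage in the diff. */
AioContext **get_thread_aio_context(void)
{
    return &thread_aio_context;
}
```

Returning the slot's address (rather than the pointer itself) would let an event loop thread both read and install its own context through the same accessor.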
Paolo Bonzini - June 14, 2013, 2:12 p.m.
On 14/06/2013 05:48, Stefan Hajnoczi wrote:
> Associating a BlockDriverState with a single AioContext is not flexible
> enough.  Once we make BlockDriverState thread-safe, it will be possible
> to call bdrv_*() functions from multiple event loops.

I'm afraid that this is trading some pain now (converting
qemu_bh_new/qemu_set_fd_handler to aio_bh_new/aio_set_fd_handler) for
more pain later (having to make BDS thread-safe).  There aren't that
many (~40) in block layer code.

Making BlockDriverState thread-safe is hard; it is much simpler to run
all the BlockDriverState code in the AioContext thread itself.

There are some things that cannot run there (basically monitor commands
and other places that currently use bdrv_drain_all), but they can simply
take a "big AioContext lock".

Paolo
Stefan Hajnoczi - June 17, 2013, 2:57 p.m.
On Fri, Jun 14, 2013 at 10:12:00AM -0400, Paolo Bonzini wrote:
> On 14/06/2013 05:48, Stefan Hajnoczi wrote:
> > Associating a BlockDriverState with a single AioContext is not flexible
> > enough.  Once we make BlockDriverState thread-safe, it will be possible
> > to call bdrv_*() functions from multiple event loops.
> 
> I'm afraid that this is trading some pain now (converting
> qemu_bh_new/qemu_set_fd_handler to aio_bh_new/aio_set_fd_handler) for
> more pain later (having to make BDS thread-safe).  There aren't that
> many (~40) in block layer code.
> 
> Making BlockDriverState thread-safe is hard; it is much simpler to run
> all the BlockDriverState code in the AioContext thread itself.
> 
> There are some things that cannot run there (basically monitor commands
> and other places that currently use bdrv_drain_all), but they can simply
> take a "big AioContext lock".

I don't like a big AioContext lock that stops the event loop, because we
need to support an M:N device:iothread model where stopping an event loop
means artificially stopping other devices too.

Maybe it's workable though, since the only commands that would take an
AioContext lock are rare and shouldn't impact performance.  I guess then
it comes down to robustness: whether a hung NFS mount can affect the entire VM
instead of just hanging an emulated disk device.

Still, I have tried the lock approach and agree it is hard.  I'll
experiment some more and hopefully come up with something that handles
block jobs and the run-time NBD server.

Stefan
Paolo Bonzini - June 17, 2013, 3:03 p.m.
On 17/06/2013 16:57, Stefan Hajnoczi wrote:
> On Fri, Jun 14, 2013 at 10:12:00AM -0400, Paolo Bonzini wrote:
>> On 14/06/2013 05:48, Stefan Hajnoczi wrote:
>>> Associating a BlockDriverState with a single AioContext is not flexible
>>> enough.  Once we make BlockDriverState thread-safe, it will be possible
>>> to call bdrv_*() functions from multiple event loops.
>>
>> I'm afraid that this is trading some pain now (converting
>> qemu_bh_new/qemu_set_fd_handler to aio_bh_new/aio_set_fd_handler) for
>> more pain later (having to make BDS thread-safe).  There aren't that
>> many (~40) in block layer code.
>>
>> Making BlockDriverState thread-safe is hard; it is much simpler to run
>> all the BlockDriverState code in the AioContext thread itself.
>>
>> There are some things that cannot run there (basically monitor commands
>> and other places that currently use bdrv_drain_all), but they can simply
>> take a "big AioContext lock".
> 
> I don't like a big AioContext lock that stops the event loop, because we
> need to support an M:N device:iothread model where stopping an event loop
> means artificially stopping other devices too.

The event loop would still run via aio_wait(), just in the monitor
thread.  However, it would stop the ioeventfd, through a mechanism such
as the one you added in this thread or something like that.

> Maybe it's workable though, since the only commands that would take an
> AioContext lock are rare and shouldn't impact performance.

Yes.

> I guess then
> it comes down to robustness: whether a hung NFS mount can affect the entire VM
> instead of just hanging an emulated disk device.

That would only be a hung NFS mount while running something that drains
a BDS, no?  It would hang the monitor, but it wouldn't be a regression
compared to what we have now, obviously.

Paolo
Stefan Hajnoczi - June 18, 2013, 9:29 a.m.
On Mon, Jun 17, 2013 at 05:03:12PM +0200, Paolo Bonzini wrote:
> On 17/06/2013 16:57, Stefan Hajnoczi wrote:
> > On Fri, Jun 14, 2013 at 10:12:00AM -0400, Paolo Bonzini wrote:
> >> On 14/06/2013 05:48, Stefan Hajnoczi wrote:
> > I guess then
> > it comes down to robustness: whether a hung NFS mount can affect the entire VM
> > instead of just hanging an emulated disk device.
> 
> That would only be a hung NFS mount while running something that drains
> a BDS, no?  It would hang the monitor, but it wouldn't be a regression
> compared to what we have now, obviously.

Yes, no regression from what we have today.

Stefan
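Paolo's "big AioContext lock" idea from the discussion above could be sketched very roughly as follows; every name here is hypothetical, not an actual QEMU API from this series:

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical sketch of the "big AioContext lock": a monitor command
 * takes the lock and pauses ioeventfd processing so it can safely drain
 * a BDS, while the event loop itself can still be driven from the
 * monitor thread via aio_wait().  None of these names are real QEMU
 * APIs. */
typedef struct {
    pthread_mutex_t lock;
    bool ioeventfd_paused;
} BigAioContext;

void big_aio_context_acquire(BigAioContext *ctx)
{
    pthread_mutex_lock(&ctx->lock);
    ctx->ioeventfd_paused = true;   /* stop device ioeventfd handlers */
}

void big_aio_context_release(BigAioContext *ctx)
{
    ctx->ioeventfd_paused = false;  /* resume device processing */
    pthread_mutex_unlock(&ctx->lock);
}
```

As the thread notes, since only rare monitor-style commands would take this lock, the common I/O path would stay uncontended; the robustness concern is that anything holding the lock across a drain can stall on a hung backend (e.g. NFS).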

Patch

diff --git a/block.c b/block.c
index a3323b2..d986fc8 100644
--- a/block.c
+++ b/block.c
@@ -4582,9 +4582,3 @@  out:
         bdrv_delete(bs);
     }
 }
-
-AioContext *bdrv_get_aio_context(BlockDriverState *bs)
-{
-    /* Currently BlockDriverState always uses the main loop AioContext */
-    return qemu_get_aio_context();
-}
diff --git a/block/raw-posix.c b/block/raw-posix.c
index c0ccf27..821c19f 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -796,7 +796,7 @@  static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
     acb->aio_offset = sector_num * 512;
 
     trace_paio_submit(acb, opaque, sector_num, nb_sectors, type);
-    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    pool = aio_get_thread_pool(*get_thread_aio_context());
     return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }
 
@@ -1460,7 +1460,7 @@  static BlockDriverAIOCB *hdev_aio_ioctl(BlockDriverState *bs,
     acb->aio_offset = 0;
     acb->aio_ioctl_buf = buf;
     acb->aio_ioctl_cmd = req;
-    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    pool = aio_get_thread_pool(*get_thread_aio_context());
     return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }
 
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 7c03b6d..b5e59a8 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -158,7 +158,7 @@  static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile,
     acb->aio_offset = sector_num * 512;
 
     trace_paio_submit(acb, opaque, sector_num, nb_sectors, type);
-    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    pool = aio_get_thread_pool(*get_thread_aio_context());
     return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index ba52247..88c8b52 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -298,13 +298,6 @@  int get_tmp_filename(char *filename, int size);
 void bdrv_set_io_limits(BlockDriverState *bs,
                         BlockIOLimit *io_limits);
 
-/**
- * bdrv_get_aio_context:
- *
- * Returns: the currently bound #AioContext
- */
-AioContext *bdrv_get_aio_context(BlockDriverState *bs);
-
 #ifdef _WIN32
 int is_windows_drive(const char *filename);
 #endif