
[v2,02/13] block: Introduce bdrv_lock and bdrv_unlock API

Message ID 1433215322-23529-3-git-send-email-famz@redhat.com
State New

Commit Message

Fam Zheng June 2, 2015, 3:21 a.m. UTC
For various purposes, BDS users call bdrv_drain or bdrv_drain_all to make sure
there are no pending requests during a series of operations on the BDS. But in
the middle of the operations, the caller may 1) yield from a coroutine (mirror_run);
2) defer the next part of work to a BH (mirror_run); 3) call nested aio_poll
(qmp_transaction); etc.

This lock/unlock API is introduced to help ensure the above complications won't
spoil the purpose of the bdrv_drain(): bdrv_lock should quiesce other readers
and writers at the beginning of such operations, and bdrv_unlock should resume
the blocked requests.

This patch only adds the API, so that some bdrv_drain() callers can
already switch to it without behavior change. Fleshing out of the
functions will follow.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c                   |  4 ++++
 block/io.c                | 12 ++++++++++++
 include/block/block.h     | 14 ++++++++++++++
 include/block/block_int.h |  2 ++
 4 files changed, 32 insertions(+)
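
A minimal sketch of the intended caller pattern (hypothetical, not part of the
patch): do_multi_step_operation() stands in for callers such as mirror_run or
qmp_transaction that today rely on a bare bdrv_drain(). It assumes the caller
already holds the BDS's AioContext; quiescing new readers and writers is the
stated goal of the follow-up patches, since this patch only drains.

#include "block/block.h"

/* Hypothetical multi-step operation that must not see new requests sneak in
 * between its steps (e.g. across a nested aio_poll, a BH, or a coroutine
 * yield).
 */
static void do_multi_step_operation(BlockDriverState *bs)
{
    bdrv_lock(bs);      /* drains pending requests; follow-up patches are
                         * meant to also block new readers/writers */

    /* step 1: inspect or modify the BDS */

    /* step 2: possibly run a nested aio_poll(), schedule a BH,
     * or yield from a coroutine */

    bdrv_unlock(bs);    /* resume the blocked requests */
}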

Comments

Stefan Hajnoczi June 16, 2015, 4:07 p.m. UTC | #1
On Tue, Jun 02, 2015 at 11:21:51AM +0800, Fam Zheng wrote:
> +/**
> + * bdrv_lock:
> + *
> + * Begin a temporary exclusive accessing by locking the BDS.
> + */
> +void bdrv_lock(BlockDriverState *bs);
> +
> +/**
> + * bdrv_unlock:
> + *
> + * End a exclusive accessing.
> + */
> +void bdrv_unlock(BlockDriverState *bs);

This documentation is missing important points:

1. Does AioContext need to be held by the caller?  (Yes)

2. Is this about thread safety?  (No, it's about exclusive access to a
   BDS *within* the AioContext.)

Maybe bdrv_begin_exclusive() and bdrv_end_exclusive() are clearer names?
Fam Zheng June 24, 2015, 2:47 a.m. UTC | #2
On Tue, 06/16 17:07, Stefan Hajnoczi wrote:
> On Tue, Jun 02, 2015 at 11:21:51AM +0800, Fam Zheng wrote:
> > +/**
> > + * bdrv_lock:
> > + *
> > + * Begin a temporary exclusive accessing by locking the BDS.
> > + */
> > +void bdrv_lock(BlockDriverState *bs);
> > +
> > +/**
> > + * bdrv_unlock:
> > + *
> > + * End a exclusive accessing.
> > + */
> > +void bdrv_unlock(BlockDriverState *bs);
> 
> This documentation is missing important points:
> 
> 1. Does AioContext need to be held by the caller?  (Yes)

Yes.

> 
> 2. Is this about thread safety?  (No, it's about exclusive access to a
>    BDS *within* the AioContext.)

As it has to quiesce iothreads as well (for now it's even more urgent than
exclusive access within the same AioContext), I'd rather take it as yes.

Paolo, please correct me.

BTW, is there any semantics in here that we can use for multiqueue block layer?

Fam
Fam Zheng June 24, 2015, 3:04 a.m. UTC | #3
On Wed, 06/24 10:47, Fam Zheng wrote:
> On Tue, 06/16 17:07, Stefan Hajnoczi wrote:
> > On Tue, Jun 02, 2015 at 11:21:51AM +0800, Fam Zheng wrote:
> > > +/**
> > > + * bdrv_lock:
> > > + *
> > > + * Begin a temporary exclusive accessing by locking the BDS.
> > > + */
> > > +void bdrv_lock(BlockDriverState *bs);
> > > +
> > > +/**
> > > + * bdrv_unlock:
> > > + *
> > > + * End a exclusive accessing.
> > > + */
> > > +void bdrv_unlock(BlockDriverState *bs);
> > 
> > This documentation is missing important points:
> > 
> > 1. Does AioContext need to be held by the caller?  (Yes)
> 
> Yes.
> 
> > 
> > 2. Is this about thread safety?  (No, it's about exclusive access to a
> >    BDS *within* the AioContext.)
> 
> As it has to quiesce iothreads as well (for now it's even more urgent than

s/iothreads/ioeventfd/

> exclusive access within the same AioContext), I'd rather take it as yes.
> 
> Paolo, please correct me.
> 
> BTW, is there any semantics in here that we can use for multiqueue block layer?
> 
> Fam
> 
>
Paolo Bonzini June 24, 2015, 9:14 a.m. UTC | #4
On 24/06/2015 04:47, Fam Zheng wrote:
>> > 2. Is this about thread safety?  (No, it's about exclusive access to a
>> >    BDS *within* the AioContext.)
> As it has to quiesce iothreads as well (for now it's even more urgent than
> exclusive access within the same AioContext), I'd rather take it as yes.

For now it's a "no", because there are no races between threads: the
main thread has acquired the AioContext and cut away the iothread.

However, as we move towards fine-grained AioContext critical sections,
it will become a "yes".

Paolo
Stefan Hajnoczi June 24, 2015, 9:35 a.m. UTC | #5
On Wed, Jun 24, 2015 at 10:47:47AM +0800, Fam Zheng wrote:
> On Tue, 06/16 17:07, Stefan Hajnoczi wrote:
> > On Tue, Jun 02, 2015 at 11:21:51AM +0800, Fam Zheng wrote:
> > 2. Is this about thread safety?  (No, it's about exclusive access to a
> >    BDS *within* the AioContext.)
> 
> As it has to quiesce iothreads as well (for now it's even more urgent than
> exclusive access within the same AioContext), I'd rather take it as yes.

This mechanism is not about threads and it is also not thread-safe (the
caller must acquire AioContext themselves).

The mechanism is about notifying the users of a drive that no requests
should be submitted.

> BTW, is there any semantics in here that we can use for multiqueue block layer?

The callback to remove/add ioeventfd is needed by multiqueue in the same
way as dataplane.

I think the actual multiqueue I/O will need to be more fine-grained
because the goal is to process I/O requests in parallel and
independently.  Multiqueue should not require exclusive access (which
this mechanism provides) - that would defeat the point of multiqueue.

Stefan
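
To illustrate point 1 above (a sketch, not from the series): the lock itself
provides no thread safety, so a main-loop caller acting on a BDS that may live
in an iothread's AioContext is expected to acquire that AioContext explicitly
around the locked section. bdrv_get_aio_context(), aio_context_acquire() and
aio_context_release() are existing APIs; exclusive_section_example() is a
made-up name.

#include "block/block.h"
#include "block/aio.h"

static void exclusive_section_example(BlockDriverState *bs)
{
    AioContext *ctx = bdrv_get_aio_context(bs);

    aio_context_acquire(ctx);   /* the caller must hold the AioContext */
    bdrv_lock(bs);              /* drain in-flight requests */

    /* operate on bs exclusively within this AioContext */

    bdrv_unlock(bs);
    aio_context_release(ctx);
}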

Patch

diff --git a/block.c b/block.c
index b589506..0e25dbd 100644
--- a/block.c
+++ b/block.c
@@ -1716,6 +1716,7 @@  void bdrv_close(BlockDriverState *bs)
 {
     BdrvAioNotifier *ban, *ban_next;
 
+    assert(bs->lock_level == 0);
     if (bs->job) {
         block_job_cancel_sync(bs->job);
     }
@@ -1850,6 +1851,9 @@  static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
     bs_dest->device_list = bs_src->device_list;
     bs_dest->blk = bs_src->blk;
 
+    /* lock */
+    bs_dest->lock_level = bs_src->lock_level;
+
     memcpy(bs_dest->op_blockers, bs_src->op_blockers,
            sizeof(bs_dest->op_blockers));
 }
diff --git a/block/io.c b/block/io.c
index e394d92..49060e5 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2601,3 +2601,15 @@  void bdrv_flush_io_queue(BlockDriverState *bs)
         bdrv_flush_io_queue(bs->file);
     }
 }
+
+void bdrv_lock(BlockDriverState *bs)
+{
+    bs->lock_level++;
+    bdrv_drain(bs);
+}
+
+void bdrv_unlock(BlockDriverState *bs)
+{
+    bs->lock_level--;
+    assert(bs->lock_level >= 0);
+}
diff --git a/include/block/block.h b/include/block/block.h
index f7680b6..b49194d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -592,6 +592,20 @@  void bdrv_io_plug(BlockDriverState *bs);
 void bdrv_io_unplug(BlockDriverState *bs);
 void bdrv_flush_io_queue(BlockDriverState *bs);
 
+/**
+ * bdrv_lock:
+ *
+ * Begin a temporary exclusive accessing by locking the BDS.
+ */
+void bdrv_lock(BlockDriverState *bs);
+
+/**
+ * bdrv_unlock:
+ *
+ * End a exclusive accessing.
+ */
+void bdrv_unlock(BlockDriverState *bs);
+
 BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
 
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index f004378..9f75d46 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -433,6 +433,8 @@  struct BlockDriverState {
     /* threshold limit for writes, in bytes. "High water mark". */
     uint64_t write_threshold_offset;
     NotifierWithReturn write_threshold_notifier;
+
+    int lock_level;
 };
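
Reading the implementation above: lock_level is a plain counter, so
bdrv_lock()/bdrv_unlock() nest, and the new assertion in bdrv_close() only
checks that lock and unlock calls were balanced before close. A hypothetical
nested use would look like:

/* Hypothetical nesting, permitted by the counter-based lock_level */
bdrv_lock(bs);      /* lock_level == 1, bdrv_drain(bs) runs */
bdrv_lock(bs);      /* lock_level == 2, drains again */
/* ... */
bdrv_unlock(bs);    /* lock_level == 1 */
bdrv_unlock(bs);    /* lock_level == 0, bdrv_close(bs) is now allowed */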