
[v2,11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all

Message ID 5553369C.9010907@redhat.com
State New

Commit Message

Paolo Bonzini May 13, 2015, 11:33 a.m. UTC
On 13/05/2015 13:08, Fam Zheng wrote:
> > I think this isn't enough.  It's the callers of bdrv_drain and
> > bdrv_drain_all that need to block before drain and unblock before
> > aio_context_release.
> 
> Which callers do you mean? qmp_transaction is covered in this series.

All of them. :(

In some cases it may be unnecessary.  I'm thinking of bdrv_set_aio_context,
and mig_save_device_dirty, but that's probably the exception rather than
the rule.

In some cases, like bdrv_snapshot_delete, the change is trivial.
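
A minimal sketch of that pattern, for a bdrv_snapshot_delete-style caller,
could look like this (assuming the BLOCK_OP_TYPE_DEVICE_IO op type this
series proposes; the wrapper name and error message are made up purely for
illustration):

#include "block/block.h"
#include "block/snapshot.h"
#include "qapi/error.h"

/* Illustrative wrapper: block device IO before draining, keep the blocker
 * for the whole operation, and unblock before releasing the AioContext. */
static int snapshot_delete_drained(BlockDriverState *bs,
                                   const char *snapshot_id,
                                   const char *name,
                                   Error **errp)
{
    AioContext *ctx = bdrv_get_aio_context(bs);
    Error *blocker = NULL;
    int ret;

    error_setg(&blocker, "snapshot delete in progress");

    aio_context_acquire(ctx);
    bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);    /* ... block ... */
    bdrv_drain(bs);

    ret = bdrv_snapshot_delete(bs, snapshot_id, name, errp);

    bdrv_op_unblock(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);  /* ... unblock ... */
    aio_context_release(ctx);

    error_free(blocker);
    return ret;
}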

The only somewhat more complex case is probably block/mirror.c.  There,
the aio_context_release happens through block_job_sleep_ns or
qemu_coroutine_yield.  I guess the change would look something like
this sketch, which is reproduced in the Patch section at the bottom of
this page.

It can be the topic of a separate series.  But this patch brings a
false sense of security (either the blocker is unnecessary, or it
needs to last after bdrv_drain returns), so I think it should be
dropped.

Paolo

Comments

Fam Zheng May 13, 2015, 3:17 p.m. UTC | #1
On Wed, 05/13 13:33, Paolo Bonzini wrote:
> 
> 
> On 13/05/2015 13:08, Fam Zheng wrote:
> > > I think this isn't enough.  It's the callers of bdrv_drain and
> > > bdrv_drain_all that need to block before drain and unblock before
> > > aio_context_release.
> > 
> > Which callers do you mean? qmp_transaction is covered in this series.
> 
> All of them. :(
> 
> In some cases it may be unnecessary.  I'm thinking of bdrv_set_aio_context,
> and mig_save_device_dirty, but that's probably the exception rather than
> the rule.
> 
> In some cases, like bdrv_snapshot_delete, the change is trivial.
> 
> The only somewhat more complex case is probably block/mirror.c.  There,
> the aio_context_release happens through block_job_sleep_ns or
> qemu_coroutine_yield.  I guess the change would look something like
> this sketch:
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index 58f391a..dcfede0 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -513,16 +513,29 @@ static void coroutine_fn mirror_run(void *opaque)
>  
>          if (cnt == 0 && should_complete) {
>              /* The dirty bitmap is not updated while operations are pending.
> -             * If we're about to exit, wait for pending operations before
> -             * calling bdrv_get_dirty_count(bs), or we may exit while the
> +             * If we're about to exit, wait for pending operations and
> +             * recheck bdrv_get_dirty_count(bs), or we may exit while the
>               * source has dirty data to copy!
>               *
>               * Note that I/O can be submitted by the guest while
> -             * mirror_populate runs.
> +             * the rest of mirror_populate runs, but we lock it out here.
>               */
>              trace_mirror_before_drain(s, cnt);
> +
> +            // ... block ...
>              bdrv_drain(bs);
>              cnt = bdrv_get_dirty_count(s->dirty_bitmap);
> +
> +            if (cnt == 0) {
> +                /* The two disks are in sync.  Exit and report successful
> +                 * completion.
> +                 */
> +                assert(s->synced && QLIST_EMPTY(&bs->tracked_requests));
> +                s->common.cancelled = false;
> +                break; // ... and unblock somewhere else...
> +            }
> +
> +            // ... unblock ...
>          }
>  
>          ret = 0;
> @@ -535,13 +549,6 @@ static void coroutine_fn mirror_run(void *opaque)
>          } else if (!should_complete) {
>              delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
>              block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
> -        } else if (cnt == 0) {
> -            /* The two disks are in sync.  Exit and report successful
> -             * completion.
> -             */
> -            assert(QLIST_EMPTY(&bs->tracked_requests));
> -            s->common.cancelled = false;
> -            break;
>          }
>          last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>      }
> 
> 
> It can be the topic of a separate series.  But this patch brings a
> false sense of security (either the blocker is unnecessary, or it
> needs to last after bdrv_drain returns), so I think it should be
> dropped.

Doesn't this let bdrv_drain_all return when virtio-blk-dataplane is under a
high workload, in the places where you say "the blocker is unnecessary"?

Fam
Paolo Bonzini May 13, 2015, 3:25 p.m. UTC | #2
On 13/05/2015 17:17, Fam Zheng wrote:
>> > 
>> > It can be the topic of a separate series.  But this patch brings a
>> > false sense of security (either the blocker is unnecessary, or it
>> > needs to last after bdrv_drain returns), so I think it should be
>> > dropped.
> Doesn't this let bdrv_drain_all return when virtio-blk-dataplane is under a
> high workload, in the places where you say "the blocker is unnecessary"?

Yes, you're right.  Please document it in the commit message and the
code, it's tricky.
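
A sketch of the kind of comment this could turn into, with purely
illustrative wording:

/*
 * Blocking "device IO" inside bdrv_drain()/bdrv_drain_all() keeps a busy
 * dataplane device (e.g. virtio-blk-dataplane) from submitting new
 * requests while we wait for in-flight ones, so the drain can actually
 * finish under a high guest workload.  It does NOT protect the caller
 * after the drain returns: a caller that needs the device to stay quiet
 * must take its own blocker before calling bdrv_drain() and drop it only
 * when it is done, before aio_context_release().
 */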

Paolo
Fam Zheng May 14, 2015, 3:02 a.m. UTC | #3
On Wed, 05/13 17:25, Paolo Bonzini wrote:
> 
> 
> On 13/05/2015 17:17, Fam Zheng wrote:
> >> > 
> >> > It can be the topic of a separate series.  But this patch brings a
> >> > false sense of security (either the blocker is unnecessary, or it
> >> > needs to last after bdrv_drain returns), so I think it should be
> >> > dropped.
> > Doesn't this let bdrv_drain_all return when virtio-blk-dataplane is under a
> > high workload, in the places where you say "the blocker is unnecessary"?
> 
> Yes, you're right.  Please document it in the commit message and the
> code, it's tricky.

OK, will do it.

Fam

Patch

diff --git a/block/mirror.c b/block/mirror.c
index 58f391a..dcfede0 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -513,16 +513,29 @@ static void coroutine_fn mirror_run(void *opaque)
 
         if (cnt == 0 && should_complete) {
             /* The dirty bitmap is not updated while operations are pending.
-             * If we're about to exit, wait for pending operations before
-             * calling bdrv_get_dirty_count(bs), or we may exit while the
+             * If we're about to exit, wait for pending operations and
+             * recheck bdrv_get_dirty_count(bs), or we may exit while the
              * source has dirty data to copy!
              *
              * Note that I/O can be submitted by the guest while
-             * mirror_populate runs.
+             * the rest of mirror_populate runs, but we lock it out here.
              */
             trace_mirror_before_drain(s, cnt);
+
+            // ... block ...
             bdrv_drain(bs);
             cnt = bdrv_get_dirty_count(s->dirty_bitmap);
+
+            if (cnt == 0) {
+                /* The two disks are in sync.  Exit and report successful
+                 * completion.
+                 */
+                assert(s->synced && QLIST_EMPTY(&bs->tracked_requests));
+                s->common.cancelled = false;
+                break; // ... and unblock somewhere else...
+            }
+
+            // ... unblock ...
         }
 
         ret = 0;
@@ -535,13 +549,6 @@ static void coroutine_fn mirror_run(void *opaque)
         } else if (!should_complete) {
             delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
             block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
-        } else if (cnt == 0) {
-            /* The two disks are in sync.  Exit and report successful
-             * completion.
-             */
-            assert(QLIST_EMPTY(&bs->tracked_requests));
-            s->common.cancelled = false;
-            break;
         }
         last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     }