diff mbox

[v13,07/14] block/mirror: Improve progress report

Message ID 1413982283-10186-8-git-send-email-mreitz@redhat.com
State New
Headers show

Commit Message

Max Reitz Oct. 22, 2014, 12:51 p.m. UTC
Instead of taking the total length of the block device as the block
job's length, use the number of dirty sectors. The progress is now the
number of sectors mirrored to the target block device. Note that this
may result in the job's length increasing during operation, which is
however in fact desirable.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/mirror.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

Comments

Kevin Wolf Oct. 23, 2014, 10:52 a.m. UTC | #1
Am 22.10.2014 um 14:51 hat Max Reitz geschrieben:
> Instead of taking the total length of the block device as the block
> job's length, use the number of dirty sectors. The progress is now the
> number of sectors mirrored to the target block device. Note that this
> may result in the job's length increasing during operation, which is
> however in fact desirable.

More importantly, because it might surprise management tools, is that
the progress (as in offset/len) can actually decrease now.

I can't say whether that creates any problem, I'll rely on Eric's
Reviewed-by for that.

> Signed-off-by: Max Reitz <mreitz@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/mirror.c | 34 ++++++++++++++++++++++------------
>  1 file changed, 22 insertions(+), 12 deletions(-)

> @@ -409,6 +416,12 @@ static void coroutine_fn mirror_run(void *opaque)
>          }
>  
>          cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap);
> +        /* s->common.offset contains the number of bytes already processed so
> +         * far, cnt is the number of dirty sectors remaining and
> +         * s->sectors_in_flight is the number of sectors currently being
> +         * processed; together those are the current total operation length */
> +        s->common.len = s->common.offset +
> +                        (cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE;

Isn't s->sectors_in_flight still contained in cnt? If I understand
correctly, sectors are only marked as clean at the same time as
s->sectors_in_flight is decremented again.

Kevin
Max Reitz Oct. 23, 2014, 11:09 a.m. UTC | #2
On 2014-10-23 at 12:52, Kevin Wolf wrote:
> Am 22.10.2014 um 14:51 hat Max Reitz geschrieben:
>> Instead of taking the total length of the block device as the block
>> job's length, use the number of dirty sectors. The progress is now the
>> number of sectors mirrored to the target block device. Note that this
>> may result in the job's length increasing during operation, which is
>> however in fact desirable.
> More importantly, because it might surprise management tools, is that
> the progress (as in offset/len) can actually decrease now.
>
> I can't say whether that creates any problem, I'll rely on Eric's
> Reviewed-by for that.

Eric said, it would rather solve problems.

>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> Reviewed-by: Eric Blake <eblake@redhat.com>
>> ---
>>   block/mirror.c | 34 ++++++++++++++++++++++------------
>>   1 file changed, 22 insertions(+), 12 deletions(-)
>> @@ -409,6 +416,12 @@ static void coroutine_fn mirror_run(void *opaque)
>>           }
>>   
>>           cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap);
>> +        /* s->common.offset contains the number of bytes already processed so
>> +         * far, cnt is the number of dirty sectors remaining and
>> +         * s->sectors_in_flight is the number of sectors currently being
>> +         * processed; together those are the current total operation length */
>> +        s->common.len = s->common.offset +
>> +                        (cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE;
> Isn't s->sectors_in_flight still contained in cnt? If I understand
> correctly, sectors are only marked as clean at the same time as
> s->sectors_in_flight is decremented again.

As far as I see, bdrv_reset_dirty() is called in mirror_iteration() 
right before s->sectors_in_flight is incremented.

Max
Kevin Wolf Oct. 23, 2014, 12:03 p.m. UTC | #3
Am 23.10.2014 um 13:09 hat Max Reitz geschrieben:
> On 2014-10-23 at 12:52, Kevin Wolf wrote:
> >Am 22.10.2014 um 14:51 hat Max Reitz geschrieben:
> >>Instead of taking the total length of the block device as the block
> >>job's length, use the number of dirty sectors. The progress is now the
> >>number of sectors mirrored to the target block device. Note that this
> >>may result in the job's length increasing during operation, which is
> >>however in fact desirable.
> >More importantly, because it might surprise management tools, is that
> >the progress (as in offset/len) can actually decrease now.
> >
> >I can't say whether that creates any problem, I'll rely on Eric's
> >Reviewed-by for that.
> 
> Eric said, it would rather solve problems.
> 
> >>Signed-off-by: Max Reitz <mreitz@redhat.com>
> >>Reviewed-by: Eric Blake <eblake@redhat.com>
> >>---
> >>  block/mirror.c | 34 ++++++++++++++++++++++------------
> >>  1 file changed, 22 insertions(+), 12 deletions(-)
> >>@@ -409,6 +416,12 @@ static void coroutine_fn mirror_run(void *opaque)
> >>          }
> >>          cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap);
> >>+        /* s->common.offset contains the number of bytes already processed so
> >>+         * far, cnt is the number of dirty sectors remaining and
> >>+         * s->sectors_in_flight is the number of sectors currently being
> >>+         * processed; together those are the current total operation length */
> >>+        s->common.len = s->common.offset +
> >>+                        (cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE;
> >Isn't s->sectors_in_flight still contained in cnt? If I understand
> >correctly, sectors are only marked as clean at the same time as
> >s->sectors_in_flight is decremented again.
> 
> As far as I see, bdrv_reset_dirty() is called in mirror_iteration()
> right before s->sectors_in_flight is incremented.

You're right, I confused it with the other bitmap operations in
mirror_iteration_done(). So:

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake Oct. 23, 2014, 1:10 p.m. UTC | #4
On 10/23/2014 04:52 AM, Kevin Wolf wrote:
> Am 22.10.2014 um 14:51 hat Max Reitz geschrieben:
>> Instead of taking the total length of the block device as the block
>> job's length, use the number of dirty sectors. The progress is now the
>> number of sectors mirrored to the target block device. Note that this
>> may result in the job's length increasing during operation, which is
>> however in fact desirable.
> 
> More importantly, because it might surprise management tools, is that
> the progress (as in offset/len) can actually decrease now.

Decreased progress is a GOOD thing - it tells management that the guest
is causing I/O faster than the mirroring can keep up with, and that it
might be time to turn on some throttling if the mirror is going to have
a chance at converging.  The old code didn't have any correlation to
progress complete vs work remaining, making it harder to accurately
estimate how long a job still has to go.

> 
> I can't say whether that creates any problem, I'll rely on Eric's
> Reviewed-by for that.

Libvirt has already documented that for all jobs (including block jobs),
progress is estimated by offset/total, but that total need not be
constant between two consecutive calls, precisely for the behavior that
this patch moves toward.
diff mbox

Patch

diff --git a/block/mirror.c b/block/mirror.c
index e8a43eb..2a1acfe 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -45,6 +45,7 @@  typedef struct MirrorBlockJob {
     int64_t sector_num;
     int64_t granularity;
     size_t buf_size;
+    int64_t bdev_length;
     unsigned long *cow_bitmap;
     BdrvDirtyBitmap *dirty_bitmap;
     HBitmapIter hbi;
@@ -54,6 +55,7 @@  typedef struct MirrorBlockJob {
 
     unsigned long *in_flight_bitmap;
     int in_flight;
+    int sectors_in_flight;
     int ret;
 } MirrorBlockJob;
 
@@ -87,6 +89,7 @@  static void mirror_iteration_done(MirrorOp *op, int ret)
     trace_mirror_iteration_done(s, op->sector_num, op->nb_sectors, ret);
 
     s->in_flight--;
+    s->sectors_in_flight -= op->nb_sectors;
     iov = op->qiov.iov;
     for (i = 0; i < op->qiov.niov; i++) {
         MirrorBuffer *buf = (MirrorBuffer *) iov[i].iov_base;
@@ -98,8 +101,11 @@  static void mirror_iteration_done(MirrorOp *op, int ret)
     chunk_num = op->sector_num / sectors_per_chunk;
     nb_chunks = op->nb_sectors / sectors_per_chunk;
     bitmap_clear(s->in_flight_bitmap, chunk_num, nb_chunks);
-    if (s->cow_bitmap && ret >= 0) {
-        bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
+    if (ret >= 0) {
+        if (s->cow_bitmap) {
+            bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
+        }
+        s->common.offset += (uint64_t)op->nb_sectors * BDRV_SECTOR_SIZE;
     }
 
     qemu_iovec_destroy(&op->qiov);
@@ -172,7 +178,7 @@  static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
     hbitmap_next_sector = s->sector_num;
     sector_num = s->sector_num;
     sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
-    end = s->common.len >> BDRV_SECTOR_BITS;
+    end = s->bdev_length / BDRV_SECTOR_SIZE;
 
     /* Extend the QEMUIOVector to include all adjacent blocks that will
      * be copied in this operation.
@@ -284,6 +290,7 @@  static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 
     /* Copy the dirty cluster.  */
     s->in_flight++;
+    s->sectors_in_flight += nb_sectors;
     trace_mirror_one_iteration(s, sector_num, nb_sectors);
     bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
                    mirror_read_complete, op);
@@ -329,11 +336,11 @@  static void coroutine_fn mirror_run(void *opaque)
         goto immediate_exit;
     }
 
-    s->common.len = bdrv_getlength(bs);
-    if (s->common.len < 0) {
-        ret = s->common.len;
+    s->bdev_length = bdrv_getlength(bs);
+    if (s->bdev_length < 0) {
+        ret = s->bdev_length;
         goto immediate_exit;
-    } else if (s->common.len == 0) {
+    } else if (s->bdev_length == 0) {
         /* Report BLOCK_JOB_READY and wait for complete. */
         block_job_event_ready(&s->common);
         s->synced = true;
@@ -344,7 +351,7 @@  static void coroutine_fn mirror_run(void *opaque)
         goto immediate_exit;
     }
 
-    length = DIV_ROUND_UP(s->common.len, s->granularity);
+    length = DIV_ROUND_UP(s->bdev_length, s->granularity);
     s->in_flight_bitmap = bitmap_new(length);
 
     /* If we have no backing file yet in the destination, we cannot let
@@ -364,7 +371,7 @@  static void coroutine_fn mirror_run(void *opaque)
         }
     }
 
-    end = s->common.len >> BDRV_SECTOR_BITS;
+    end = s->bdev_length / BDRV_SECTOR_SIZE;
     s->buf = qemu_try_blockalign(bs, s->buf_size);
     if (s->buf == NULL) {
         ret = -ENOMEM;
@@ -409,6 +416,12 @@  static void coroutine_fn mirror_run(void *opaque)
         }
 
         cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap);
+        /* s->common.offset contains the number of bytes already processed so
+         * far, cnt is the number of dirty sectors remaining and
+         * s->sectors_in_flight is the number of sectors currently being
+         * processed; together those are the current total operation length */
+        s->common.len = s->common.offset +
+                        (cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE;
 
         /* Note that even when no rate limit is applied we need to yield
          * periodically with no pending I/O so that qemu_aio_flush() returns.
@@ -445,7 +458,6 @@  static void coroutine_fn mirror_run(void *opaque)
                  * report completion.  This way, block-job-cancel will leave
                  * the target in a consistent state.
                  */
-                s->common.offset = end * BDRV_SECTOR_SIZE;
                 if (!s->synced) {
                     block_job_event_ready(&s->common);
                     s->synced = true;
@@ -474,8 +486,6 @@  static void coroutine_fn mirror_run(void *opaque)
         ret = 0;
         trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
         if (!s->synced) {
-            /* Publish progress */
-            s->common.offset = (end - cnt) * BDRV_SECTOR_SIZE;
             block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
             if (block_job_is_cancelled(&s->common)) {
                 break;