Message ID | 1413982283-10186-8-git-send-email-mreitz@redhat.com |
---|---|
State | New |
Headers | show |
Am 22.10.2014 um 14:51 hat Max Reitz geschrieben: > Instead of taking the total length of the block device as the block > job's length, use the number of dirty sectors. The progress is now the > number of sectors mirrored to the target block device. Note that this > may result in the job's length increasing during operation, which is > however in fact desirable. More importantly, because it might surprise management tools, is that the progress (as in offset/len) can actually decrease now. I can't say whether that creates any problem, I'll rely on Eric's Reviewed-by for that. > Signed-off-by: Max Reitz <mreitz@redhat.com> > Reviewed-by: Eric Blake <eblake@redhat.com> > --- > block/mirror.c | 34 ++++++++++++++++++++++------------ > 1 file changed, 22 insertions(+), 12 deletions(-) > @@ -409,6 +416,12 @@ static void coroutine_fn mirror_run(void *opaque) > } > > cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap); > + /* s->common.offset contains the number of bytes already processed so > + * far, cnt is the number of dirty sectors remaining and > + * s->sectors_in_flight is the number of sectors currently being > + * processed; together those are the current total operation length */ > + s->common.len = s->common.offset + > + (cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE; Isn't s->sectors_in_flight still contained in cnt? If I understand correctly, sectors are only marked as clean at the same time as s->sectors_in_flight is decremented again. Kevin
On 2014-10-23 at 12:52, Kevin Wolf wrote: > Am 22.10.2014 um 14:51 hat Max Reitz geschrieben: >> Instead of taking the total length of the block device as the block >> job's length, use the number of dirty sectors. The progress is now the >> number of sectors mirrored to the target block device. Note that this >> may result in the job's length increasing during operation, which is >> however in fact desirable. > More importantly, because it might surprise management tools, is that > the progress (as in offset/len) can actually decrease now. > > I can't say whether that creates any problem, I'll rely on Eric's > Reviewed-by for that. Eric said, it would rather solve problems. >> Signed-off-by: Max Reitz <mreitz@redhat.com> >> Reviewed-by: Eric Blake <eblake@redhat.com> >> --- >> block/mirror.c | 34 ++++++++++++++++++++++------------ >> 1 file changed, 22 insertions(+), 12 deletions(-) >> @@ -409,6 +416,12 @@ static void coroutine_fn mirror_run(void *opaque) >> } >> >> cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap); >> + /* s->common.offset contains the number of bytes already processed so >> + * far, cnt is the number of dirty sectors remaining and >> + * s->sectors_in_flight is the number of sectors currently being >> + * processed; together those are the current total operation length */ >> + s->common.len = s->common.offset + >> + (cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE; > Isn't s->sectors_in_flight still contained in cnt? If I understand > correctly, sectors are only marked as clean at the same time as > s->sectors_in_flight is decremented again. As far as I see, bdrv_reset_dirty() is called in mirror_iteration() right before s->sectors_in_flight is incremented. Max
Am 23.10.2014 um 13:09 hat Max Reitz geschrieben: > On 2014-10-23 at 12:52, Kevin Wolf wrote: > >Am 22.10.2014 um 14:51 hat Max Reitz geschrieben: > >>Instead of taking the total length of the block device as the block > >>job's length, use the number of dirty sectors. The progress is now the > >>number of sectors mirrored to the target block device. Note that this > >>may result in the job's length increasing during operation, which is > >>however in fact desirable. > >More importantly, because it might surprise management tools, is that > >the progress (as in offset/len) can actually decrease now. > > > >I can't say whether that creates any problem, I'll rely on Eric's > >Reviewed-by for that. > > Eric said, it would rather solve problems. > > >>Signed-off-by: Max Reitz <mreitz@redhat.com> > >>Reviewed-by: Eric Blake <eblake@redhat.com> > >>--- > >> block/mirror.c | 34 ++++++++++++++++++++++------------ > >> 1 file changed, 22 insertions(+), 12 deletions(-) > >>@@ -409,6 +416,12 @@ static void coroutine_fn mirror_run(void *opaque) > >> } > >> cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap); > >>+ /* s->common.offset contains the number of bytes already processed so > >>+ * far, cnt is the number of dirty sectors remaining and > >>+ * s->sectors_in_flight is the number of sectors currently being > >>+ * processed; together those are the current total operation length */ > >>+ s->common.len = s->common.offset + > >>+ (cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE; > >Isn't s->sectors_in_flight still contained in cnt? If I understand > >correctly, sectors are only marked as clean at the same time as > >s->sectors_in_flight is decremented again. > > As far as I see, bdrv_reset_dirty() is called in mirror_iteration() > right before s->sectors_in_flight is incremented. You're right, I confused it with the other bitmap operations in mirror_iteration_done(). So: Reviewed-by: Kevin Wolf <kwolf@redhat.com>
On 10/23/2014 04:52 AM, Kevin Wolf wrote: > Am 22.10.2014 um 14:51 hat Max Reitz geschrieben: >> Instead of taking the total length of the block device as the block >> job's length, use the number of dirty sectors. The progress is now the >> number of sectors mirrored to the target block device. Note that this >> may result in the job's length increasing during operation, which is >> however in fact desirable. > > More importantly, because it might surprise management tools, is that > the progress (as in offset/len) can actually decrease now. Decreased progress is a GOOD thing - it tells management that the guest is causing I/O faster than the mirroring can keep up with, and that it might be time to turn on some throttling if the mirror is going to have a chance at converging. The old code didn't have any correlation to progress complete vs work remaining, making it harder to accurately estimate how long a job still has to go. > > I can't say whether that creates any problem, I'll rely on Eric's > Reviewed-by for that. Libvirt has already documented that for all jobs (including block jobs), progress is estimated by offset/total, but that total need not be constant between two consecutive calls, precisely for the behavior that this patch moves toward.
diff --git a/block/mirror.c b/block/mirror.c index e8a43eb..2a1acfe 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -45,6 +45,7 @@ typedef struct MirrorBlockJob { int64_t sector_num; int64_t granularity; size_t buf_size; + int64_t bdev_length; unsigned long *cow_bitmap; BdrvDirtyBitmap *dirty_bitmap; HBitmapIter hbi; @@ -54,6 +55,7 @@ typedef struct MirrorBlockJob { unsigned long *in_flight_bitmap; int in_flight; + int sectors_in_flight; int ret; } MirrorBlockJob; @@ -87,6 +89,7 @@ static void mirror_iteration_done(MirrorOp *op, int ret) trace_mirror_iteration_done(s, op->sector_num, op->nb_sectors, ret); s->in_flight--; + s->sectors_in_flight -= op->nb_sectors; iov = op->qiov.iov; for (i = 0; i < op->qiov.niov; i++) { MirrorBuffer *buf = (MirrorBuffer *) iov[i].iov_base; @@ -98,8 +101,11 @@ static void mirror_iteration_done(MirrorOp *op, int ret) chunk_num = op->sector_num / sectors_per_chunk; nb_chunks = op->nb_sectors / sectors_per_chunk; bitmap_clear(s->in_flight_bitmap, chunk_num, nb_chunks); - if (s->cow_bitmap && ret >= 0) { - bitmap_set(s->cow_bitmap, chunk_num, nb_chunks); + if (ret >= 0) { + if (s->cow_bitmap) { + bitmap_set(s->cow_bitmap, chunk_num, nb_chunks); + } + s->common.offset += (uint64_t)op->nb_sectors * BDRV_SECTOR_SIZE; } qemu_iovec_destroy(&op->qiov); @@ -172,7 +178,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s) hbitmap_next_sector = s->sector_num; sector_num = s->sector_num; sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS; - end = s->common.len >> BDRV_SECTOR_BITS; + end = s->bdev_length / BDRV_SECTOR_SIZE; /* Extend the QEMUIOVector to include all adjacent blocks that will * be copied in this operation. @@ -284,6 +290,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s) /* Copy the dirty cluster. */ s->in_flight++; + s->sectors_in_flight += nb_sectors; trace_mirror_one_iteration(s, sector_num, nb_sectors); bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors, mirror_read_complete, op); @@ -329,11 +336,11 @@ static void coroutine_fn mirror_run(void *opaque) goto immediate_exit; } - s->common.len = bdrv_getlength(bs); - if (s->common.len < 0) { - ret = s->common.len; + s->bdev_length = bdrv_getlength(bs); + if (s->bdev_length < 0) { + ret = s->bdev_length; goto immediate_exit; - } else if (s->common.len == 0) { + } else if (s->bdev_length == 0) { /* Report BLOCK_JOB_READY and wait for complete. */ block_job_event_ready(&s->common); s->synced = true; @@ -344,7 +351,7 @@ static void coroutine_fn mirror_run(void *opaque) goto immediate_exit; } - length = DIV_ROUND_UP(s->common.len, s->granularity); + length = DIV_ROUND_UP(s->bdev_length, s->granularity); s->in_flight_bitmap = bitmap_new(length); /* If we have no backing file yet in the destination, we cannot let @@ -364,7 +371,7 @@ static void coroutine_fn mirror_run(void *opaque) } } - end = s->common.len >> BDRV_SECTOR_BITS; + end = s->bdev_length / BDRV_SECTOR_SIZE; s->buf = qemu_try_blockalign(bs, s->buf_size); if (s->buf == NULL) { ret = -ENOMEM; @@ -409,6 +416,12 @@ static void coroutine_fn mirror_run(void *opaque) } cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap); + /* s->common.offset contains the number of bytes already processed so + * far, cnt is the number of dirty sectors remaining and + * s->sectors_in_flight is the number of sectors currently being + * processed; together those are the current total operation length */ + s->common.len = s->common.offset + + (cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE; /* Note that even when no rate limit is applied we need to yield * periodically with no pending I/O so that qemu_aio_flush() returns. @@ -445,7 +458,6 @@ static void coroutine_fn mirror_run(void *opaque) * report completion. This way, block-job-cancel will leave * the target in a consistent state. */ - s->common.offset = end * BDRV_SECTOR_SIZE; if (!s->synced) { block_job_event_ready(&s->common); s->synced = true; @@ -474,8 +486,6 @@ static void coroutine_fn mirror_run(void *opaque) ret = 0; trace_mirror_before_sleep(s, cnt, s->synced, delay_ns); if (!s->synced) { - /* Publish progress */ - s->common.offset = (end - cnt) * BDRV_SECTOR_SIZE; block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns); if (block_job_is_cancelled(&s->common)) { break;