Patchwork [v4,13/15] block stream: add support for partial streaming

Submitter Stefan Hajnoczi
Date Jan. 6, 2012, 2:01 p.m.
Message ID <1325858501-25741-14-git-send-email-stefanha@linux.vnet.ibm.com>
Permalink /patch/134658/
State New

Comments

Stefan Hajnoczi - Jan. 6, 2012, 2:01 p.m.
From: Marcelo Tosatti <mtosatti@redhat.com>

Add support for streaming data from an intermediate section of the
image chain (see patch and documentation for details).

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 block.c        |   64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 block.h        |    4 +++
 block/stream.c |   28 +++++++++++++++++++++---
 block_int.h    |    3 +-
 blockdev.c     |   11 ++++++---
 5 files changed, 101 insertions(+), 9 deletions(-)
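
For readers skimming the thread, here is a minimal sketch of what streaming from an intermediate section of the chain amounts to. It assumes a toy in-memory model: StreamImage, TOY_SECTORS and toy_stream_partial() are hypothetical names for illustration only, not QEMU APIs, and the real job works on BlockDriverState with coroutines and rate limiting.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_SECTORS 4  /* hypothetical image size, for illustration only */

/* Hypothetical stand-in for an image in a backing chain. */
typedef struct StreamImage {
    struct StreamImage *backing;   /* next image towards BASE, or NULL */
    bool allocated[TOY_SECTORS];   /* true if this image holds the sector */
    int data[TOY_SECTORS];         /* sector payload, one int per sector */
} StreamImage;

/*
 * Copy into top every sector that is held by an image strictly between
 * top and base, then make base the direct backing file of top.
 * Sectors that only base (or nothing) provides are left alone, so top
 * keeps reading them through base.  Returns the number of sectors copied.
 */
static size_t toy_stream_partial(StreamImage *top, StreamImage *base)
{
    size_t copied = 0;

    for (size_t s = 0; s < TOY_SECTORS; s++) {
        if (top->allocated[s]) {
            continue;  /* already in top, nothing to do */
        }
        for (StreamImage *im = top->backing; im && im != base;
             im = im->backing) {
            if (im->allocated[s]) {
                top->data[s] = im->data[s];
                top->allocated[s] = true;
                copied++;
                break;  /* the image nearest to top wins */
            }
        }
    }

    top->backing = base;  /* intermediates are no longer needed */
    return copied;
}
```

With a chain BASE <- INTER <- TOP, only the sectors that INTER holds and TOP lacks are copied; afterwards TOP points directly at BASE.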
Kevin Wolf - Jan. 12, 2012, 12:42 p.m.
On 06.01.2012 15:01, Stefan Hajnoczi wrote:
> From: Marcelo Tosatti <mtosatti@redhat.com>
> 
> Add support for streaming data from an intermediate section of the
> image chain (see patch and documentation for details).
> 
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> ---
>  block.c        |   64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  block.h        |    4 +++
>  block/stream.c |   28 +++++++++++++++++++++---
>  block_int.h    |    3 +-
>  blockdev.c     |   11 ++++++---
>  5 files changed, 101 insertions(+), 9 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 9b688a0..d2143b1 100644
> --- a/block.c
> +++ b/block.c
> @@ -2263,6 +2263,70 @@ int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
>      return data.ret;
>  }
>  
> +/*
> + * Given an image chain: [BASE] -> [INTER1] -> [INTER2] -> [TOP]
> + *
> + * Return true if the given sector is allocated in top or base.
> + * Return false if the given sector is allocated in intermediate images.

This description is inexact. A sector could be allocated both in base and
in an intermediate image.

Also, I initially thought that we need to check not only whether the
sector is allocated in BASE, but also whether it is allocated in any
parents of BASE. You don't do this: clusters that are completely
unallocated all through the chain are reported as allocated.

After reading all of the patch, I believe this provides the right
semantics: "Normal" image streaming would copy them into the topmost
file, but if you keep a backing file, you want to copy as little as
possible and keep using the backing file whenever possible.

So I suggest fixing the description rather than the implementation.
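
The semantics discussed above can be sketched with a toy model. This is only an illustration under stated assumptions: ToyImage and toy_is_allocated_base() are hypothetical names, each image is reduced to a per-sector allocation map, and the 'pnum' run-length handling of the real function is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy image, not QEMU's BlockDriverState. */
typedef struct ToyImage {
    const char *name;
    struct ToyImage *backing;  /* next image towards BASE, or NULL */
    const bool *allocated;     /* allocated[i]: sector i has data here */
} ToyImage;

/*
 * Return true if the sector does not need to be copied into top:
 * it is either already allocated in top, or no image strictly
 * between top and base holds it (so base, or whatever lies below it,
 * keeps providing the data).
 * Return false if an intermediate image holds the sector.
 */
static bool toy_is_allocated_base(const ToyImage *top, const ToyImage *base,
                                  size_t sector)
{
    if (top->allocated[sector]) {
        return true;
    }
    for (const ToyImage *im = top->backing; im && im != base;
         im = im->backing) {
        if (im->allocated[sector]) {
            return false;  /* must be streamed into top */
        }
    }
    return true;  /* reached base without finding the sector */
}
```

Note that base itself is never inspected: a sector that is unallocated everywhere above base is reported as "allocated", which is what keeps the copy minimal.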

Maybe we should also rename the function. With this semantics it's not a
generic is_allocated function any more, but something quite specific to
streaming with a base file.

Kevin
Stefan Hajnoczi - Jan. 12, 2012, 4:14 p.m.
On Thu, Jan 12, 2012 at 12:42 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> On 06.01.2012 15:01, Stefan Hajnoczi wrote:
>> From: Marcelo Tosatti <mtosatti@redhat.com>
>>
>> Add support for streaming data from an intermediate section of the
>> image chain (see patch and documentation for details).
>>
>> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
>> ---
>>  block.c        |   64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  block.h        |    4 +++
>>  block/stream.c |   28 +++++++++++++++++++++---
>>  block_int.h    |    3 +-
>>  blockdev.c     |   11 ++++++---
>>  5 files changed, 101 insertions(+), 9 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 9b688a0..d2143b1 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -2263,6 +2263,70 @@ int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
>>      return data.ret;
>>  }
>>
>> +/*
>> + * Given an image chain: [BASE] -> [INTER1] -> [INTER2] -> [TOP]
>> + *
>> + * Return true if the given sector is allocated in top or base.
>> + * Return false if the given sector is allocated in intermediate images.
>
> This description is inexact. A sector could be allocated both in base and
> in an intermediate image.
>
> Also, I initially thought that we need to check not only whether the
> sector is allocated in BASE, but also whether it is allocated in any
> parents of BASE. You don't do this: clusters that are completely
> unallocated all through the chain are reported as allocated.
>
> After reading all of the patch, I believe this provides the right
> semantics: "Normal" image streaming would copy them into the topmost
> file, but if you keep a backing file, you want to copy as little as
> possible and keep using the backing file whenever possible.
>
> So I suggest fixing the description rather than the implementation.
>
> Maybe we should also rename the function. With this semantics it's not a
> generic is_allocated function any more, but something quite specific to
> streaming with a base file.

I have moved the function into block/stream.c and renamed it to just
is_allocated_base().  The description is updated.

This makes it clearer that it's a special-case is_allocated-like function.

Stefan

Patch

diff --git a/block.c b/block.c
index 9b688a0..d2143b1 100644
--- a/block.c
+++ b/block.c
@@ -2263,6 +2263,70 @@  int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
     return data.ret;
 }
 
+/*
+ * Given an image chain: [BASE] -> [INTER1] -> [INTER2] -> [TOP]
+ *
+ * Return true if the given sector is allocated in top or base.
+ * Return false if the given sector is allocated in intermediate images.
+ *
+ * 'pnum' is set to the number of sectors (including and immediately following
+ *  the specified sector) that are known to be in the same
+ *  allocated/unallocated state.
+ *
+ */
+int coroutine_fn bdrv_co_is_allocated_base(BlockDriverState *top,
+                                           BlockDriverState *base,
+                                           int64_t sector_num,
+                                           int nb_sectors, int *pnum)
+{
+    BlockDriverState *intermediate;
+    int ret, n;
+
+    ret = bdrv_co_is_allocated(top, sector_num, nb_sectors, &n);
+    if (ret) {
+        *pnum = n;
+        return ret;
+    }
+
+    /*
+     * Is the unallocated chunk [sector_num, n] also
+     * unallocated between base and top?
+     */
+    intermediate = top->backing_hd;
+
+    while (intermediate) {
+        int pnum_inter;
+
+        /* reached base */
+        if (intermediate == base) {
+            *pnum = n;
+            return 1;
+        }
+        ret = bdrv_co_is_allocated(intermediate, sector_num, nb_sectors,
+                                   &pnum_inter);
+        if (ret < 0) {
+            return ret;
+        } else if (ret) {
+            *pnum = pnum_inter;
+            return 0;
+        }
+
+        /*
+         * [sector_num, nb_sectors] is unallocated on top but intermediate
+         * might have
+         *
+         * [sector_num+x, nb_sectors] allocated.
+         */
+        if (n > pnum_inter) {
+            n = pnum_inter;
+        }
+
+        intermediate = intermediate->backing_hd;
+    }
+
+    return 1;
+}
+
 void bdrv_mon_event(const BlockDriverState *bdrv,
                     BlockMonEventAction action, int is_read)
 {
diff --git a/block.h b/block.h
index a1d9b56..0e786a9 100644
--- a/block.h
+++ b/block.h
@@ -229,6 +229,10 @@  int bdrv_co_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
 int bdrv_has_zero_init(BlockDriverState *bs);
 int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
                       int *pnum);
+int coroutine_fn bdrv_co_is_allocated_base(BlockDriverState *top,
+                                           BlockDriverState *base,
+                                           int64_t sector_num, int nb_sectors,
+                                           int *pnum);
 
 #define BIOS_ATA_TRANSLATION_AUTO   0
 #define BIOS_ATA_TRANSLATION_NONE   1
diff --git a/block/stream.c b/block/stream.c
index 5d5d672..c6f548d 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -57,6 +57,7 @@  typedef struct StreamBlockJob {
     BlockJob common;
     RateLimit limit;
     BlockDriverState *base;
+    char backing_file_id[1024];
 } StreamBlockJob;
 
 static int coroutine_fn stream_populate(BlockDriverState *bs,
@@ -79,6 +80,7 @@  static void coroutine_fn stream_run(void *opaque)
 {
     StreamBlockJob *s = opaque;
     BlockDriverState *bs = s->common.bs;
+    BlockDriverState *base = s->base;
     int64_t sector_num, end;
     int ret = 0;
     int n;
@@ -96,8 +98,17 @@  retry:
             break;
         }
 
-        ret = bdrv_co_is_allocated(bs, sector_num,
-                                   STREAM_BUFFER_SIZE / BDRV_SECTOR_SIZE, &n);
+
+        if (base) {
+            ret = bdrv_co_is_allocated_base(bs, base, sector_num,
+                                            STREAM_BUFFER_SIZE /
+                                            BDRV_SECTOR_SIZE,
+                                            &n);
+        } else {
+            ret = bdrv_co_is_allocated(bs, sector_num,
+                                       STREAM_BUFFER_SIZE / BDRV_SECTOR_SIZE,
+                                       &n);
+        }
         trace_stream_one_iteration(s, sector_num, n, ret);
         if (ret == 0) {
             if (s->common.speed) {
@@ -114,6 +125,7 @@  retry:
         if (ret < 0) {
             break;
         }
+        ret = 0;
 
         /* Publish progress */
         s->common.offset += n * BDRV_SECTOR_SIZE;
@@ -127,7 +139,11 @@  retry:
     bdrv_disable_copy_on_read(bs);
 
     if (sector_num == end && ret == 0) {
-        ret = bdrv_change_backing_file(bs, NULL, NULL);
+        const char *base_id = NULL;
+        if (base) {
+            base_id = s->backing_file_id;
+        }
+        ret = bdrv_change_backing_file(bs, base_id, NULL);
     }
 
     qemu_vfree(buf);
@@ -153,7 +169,8 @@  static BlockJobType stream_job_type = {
 };
 
 int stream_start(BlockDriverState *bs, BlockDriverState *base,
-                 BlockDriverCompletionFunc *cb, void *opaque)
+                 const char *base_id, BlockDriverCompletionFunc *cb,
+                 void *opaque)
 {
     StreamBlockJob *s;
     Coroutine *co;
@@ -164,6 +181,9 @@  int stream_start(BlockDriverState *bs, BlockDriverState *base,
 
     s = block_job_create(&stream_job_type, bs, cb, opaque);
     s->base = base;
+    if (base_id) {
+        pstrcpy(s->backing_file_id, sizeof(s->backing_file_id), base_id);
+    }
 
     co = qemu_coroutine_create(stream_run);
     trace_stream_start(bs, base, s, co, opaque);
diff --git a/block_int.h b/block_int.h
index c7c9178..ed92884 100644
--- a/block_int.h
+++ b/block_int.h
@@ -333,6 +333,7 @@  void block_job_cancel(BlockJob *job);
 bool block_job_is_cancelled(BlockJob *job);
 
 int stream_start(BlockDriverState *bs, BlockDriverState *base,
-                 BlockDriverCompletionFunc *cb, void *opaque);
+                 const char *base_id, BlockDriverCompletionFunc *cb,
+                 void *opaque);
 
 #endif /* BLOCK_INT_H */
diff --git a/blockdev.c b/blockdev.c
index 45a6ba6..5da1097 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -956,6 +956,7 @@  void qmp_block_stream(const char *device, bool has_base,
                       const char *base, Error **errp)
 {
     BlockDriverState *bs;
+    BlockDriverState *base_bs = NULL;
     int ret;
 
     bs = bdrv_find(device);
@@ -964,13 +965,15 @@  void qmp_block_stream(const char *device, bool has_base,
         return;
     }
 
-    /* Base device not supported */
     if (base) {
-        error_set(errp, QERR_NOT_SUPPORTED);
-        return;
+        base_bs = bdrv_find_backing_image(bs, base);
+        if (base_bs == NULL) {
+            error_set(errp, QERR_BASE_ID_NOT_FOUND, base);
+            return;
+        }
     }
 
-    ret = stream_start(bs, NULL, block_stream_cb, bs);
+    ret = stream_start(bs, base_bs, base, block_stream_cb, bs);
     if (ret < 0) {
         switch (ret) {
         case -EBUSY: