Patchwork [1/8] block: add bdrv_aio_stream

login
register
mail settings
Submitter Stefan Hajnoczi
Date April 27, 2011, 1:27 p.m.
Message ID <1303910855-28999-2-git-send-email-stefanha@linux.vnet.ibm.com>
Download mbox | patch
Permalink /patch/93042/
State New
Headers show

Comments

Stefan Hajnoczi - April 27, 2011, 1:27 p.m.
From: Anthony Liguori <aliguori@us.ibm.com>

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
---
 block.c     |   32 ++++++++++++++++++++++++++++++++
 block.h     |    2 ++
 block_int.h |    3 +++
 3 files changed, 37 insertions(+), 0 deletions(-)
Kevin Wolf - April 29, 2011, 11:56 a.m.
Am 27.04.2011 15:27, schrieb Stefan Hajnoczi:
> From: Anthony Liguori <aliguori@us.ibm.com>
> 
> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
> ---
>  block.c     |   32 ++++++++++++++++++++++++++++++++
>  block.h     |    2 ++
>  block_int.h |    3 +++
>  3 files changed, 37 insertions(+), 0 deletions(-)
> 
> diff --git a/block.c b/block.c
> index f731c7a..5e3476c 100644
> --- a/block.c
> +++ b/block.c
> @@ -2248,6 +2248,38 @@ BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
>      return ret;
>  }
>  
> +/**
> + * Attempt to stream an image starting from sector_num.
> + *
> + * @sector_num - the first sector to start streaming from
> + * @cb - block completion callback
> + * @opaque - data to pass completion callback
> + *
> + * Returns NULL if the image format not support streaming, the image is
> + * read-only, or no image is open.
> + *
> + * The intention of this function is for a user to execute it once with a
> + * sector_num of 0 and then upon receiving a completion callback, to remember
> + * the number of sectors "streamed", and then to call this function again with
> + * an offset adjusted by the number of sectors previously streamed.
> + *
> + * This allows a user to progressive stream in an image at a pace that makes
> + * sense.  In general, this function tries to do the smallest amount of I/O
> + * possible to do some useful work.
> + *
> + * This function only really makes sense in combination with a block format
> + * that supports copy on read and has it enabled.  If copy on read is not
> + * enabled, a block format driver may return NULL.
> + */
> +BlockDriverAIOCB *bdrv_aio_stream(BlockDriverState *bs, int64_t sector_num,
> +                                  BlockDriverCompletionFunc *cb, void *opaque)

I think bdrv_aio_stream is a bad name for this. It only becomes image
streaming because the caller repeatedly calls this function. What the
function really does is copying some data from the backing file into the
overlay image.

I'm not sure how the caller would know how many sectors have been
copied. A BlockDriverCompletionFunc usually returns 0 on success, did
you change it here to use positive numbers for something else? At least
this must be documented somewhere, but I would suggest to add a
nb_sectors argument instead so that the caller decides how many sectors
to copy.

If you say that it only makes sense with copy on read, should one think
of it as a read that throws the read data away? I think considering it a
copy function makes more sense and is independent of copy on read.

Kevin
Stefan Hajnoczi - May 6, 2011, 1:21 p.m.
On Fri, Apr 29, 2011 at 12:56 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 27.04.2011 15:27, schrieb Stefan Hajnoczi:
>> +/**
>> + * Attempt to stream an image starting from sector_num.
>> + *
>> + * @sector_num - the first sector to start streaming from
>> + * @cb - block completion callback
>> + * @opaque - data to pass completion callback
>> + *
>> + * Returns NULL if the image format not support streaming, the image is
>> + * read-only, or no image is open.
>> + *
>> + * The intention of this function is for a user to execute it once with a
>> + * sector_num of 0 and then upon receiving a completion callback, to remember
>> + * the number of sectors "streamed", and then to call this function again with
>> + * an offset adjusted by the number of sectors previously streamed.
>> + *
>> + * This allows a user to progressive stream in an image at a pace that makes
>> + * sense.  In general, this function tries to do the smallest amount of I/O
>> + * possible to do some useful work.
>> + *
>> + * This function only really makes sense in combination with a block format
>> + * that supports copy on read and has it enabled.  If copy on read is not
>> + * enabled, a block format driver may return NULL.
>> + */
>> +BlockDriverAIOCB *bdrv_aio_stream(BlockDriverState *bs, int64_t sector_num,
>> +                                  BlockDriverCompletionFunc *cb, void *opaque)
>
> I think bdrv_aio_stream is a bad name for this. It only becomes image
> streaming because the caller repeatedly calls this function. What the
> function really does is copying some data from the backing file into the
> overlay image.

That's true but bdrv_aio_copy_from_backing_file() is a bit long.  The
special thing about this operation is that it takes a starting
sector_num but no length.  The callback receives the nb_sectors.  So
this operation isn't an ordinary [start, length) copy either so
bdrv_aio_stream() isn't that bad?

> I'm not sure how the caller would know how many sectors have been
> copied. A BlockDriverCompletionFunc usually returns 0 on success, did
> you change it here to use positive numbers for something else? At least
> this must be documented somewhere, but I would suggest to add a
> nb_sectors argument instead so that the caller decides how many sectors
> to copy.

Yes, I agree that a separate nb_sectors argument would be clearer.

> If you say that it only makes sense with copy on read, should one think
> of it as a read that throws the read data away? I think considering it a
> copy function makes more sense and is independent of copy on read.

I actually think the copy-on-read statement is an implementation
detail.  I can imagine doing essentially the same behavior without
exposing copy on read to the user.  But in QED streaming is based on
copy-on-read.  Let's remove this comment.

Stefan
Kevin Wolf - May 6, 2011, 1:36 p.m.
Am 06.05.2011 15:21, schrieb Stefan Hajnoczi:
> On Fri, Apr 29, 2011 at 12:56 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> Am 27.04.2011 15:27, schrieb Stefan Hajnoczi:
>>> +/**
>>> + * Attempt to stream an image starting from sector_num.
>>> + *
>>> + * @sector_num - the first sector to start streaming from
>>> + * @cb - block completion callback
>>> + * @opaque - data to pass completion callback
>>> + *
>>> + * Returns NULL if the image format not support streaming, the image is
>>> + * read-only, or no image is open.
>>> + *
>>> + * The intention of this function is for a user to execute it once with a
>>> + * sector_num of 0 and then upon receiving a completion callback, to remember
>>> + * the number of sectors "streamed", and then to call this function again with
>>> + * an offset adjusted by the number of sectors previously streamed.
>>> + *
>>> + * This allows a user to progressive stream in an image at a pace that makes
>>> + * sense.  In general, this function tries to do the smallest amount of I/O
>>> + * possible to do some useful work.
>>> + *
>>> + * This function only really makes sense in combination with a block format
>>> + * that supports copy on read and has it enabled.  If copy on read is not
>>> + * enabled, a block format driver may return NULL.
>>> + */
>>> +BlockDriverAIOCB *bdrv_aio_stream(BlockDriverState *bs, int64_t sector_num,
>>> +                                  BlockDriverCompletionFunc *cb, void *opaque)
>>
>> I think bdrv_aio_stream is a bad name for this. It only becomes image
>> streaming because the caller repeatedly calls this function. What the
>> function really does is copying some data from the backing file into the
>> overlay image.
> 
> That's true but bdrv_aio_copy_from_backing_file() is a bit long. 

bdrv_copy_backing() or something should be short enough and still
describes what it's really doing.

> The
> special thing about this operation is that it takes a starting
> sector_num but no length.  The callback receives the nb_sectors.  So
> this operation isn't an ordinary [start, length) copy either so
> bdrv_aio_stream() isn't that bad?

Well, you're going to introduce nb_sectors anyway, so it's not really
special any more.

> I actually think the copy-on-read statement is an implementation
> detail.  I can imagine doing essentially the same behavior without
> exposing copy on read to the user.  But in QED streaming is based on
> copy-on-read.  Let's remove this comment.

Ok. Removing the comment and calling it something with "copy" in the
name should make clear what the intention is.

Kevin
Stefan Hajnoczi - May 6, 2011, 3:47 p.m.
On Fri, May 6, 2011 at 2:36 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 06.05.2011 15:21, schrieb Stefan Hajnoczi:
>> On Fri, Apr 29, 2011 at 12:56 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>>> Am 27.04.2011 15:27, schrieb Stefan Hajnoczi:
>>>> +/**
>>>> + * Attempt to stream an image starting from sector_num.
>>>> + *
>>>> + * @sector_num - the first sector to start streaming from
>>>> + * @cb - block completion callback
>>>> + * @opaque - data to pass completion callback
>>>> + *
>>>> + * Returns NULL if the image format not support streaming, the image is
>>>> + * read-only, or no image is open.
>>>> + *
>>>> + * The intention of this function is for a user to execute it once with a
>>>> + * sector_num of 0 and then upon receiving a completion callback, to remember
>>>> + * the number of sectors "streamed", and then to call this function again with
>>>> + * an offset adjusted by the number of sectors previously streamed.
>>>> + *
>>>> + * This allows a user to progressive stream in an image at a pace that makes
>>>> + * sense.  In general, this function tries to do the smallest amount of I/O
>>>> + * possible to do some useful work.
>>>> + *
>>>> + * This function only really makes sense in combination with a block format
>>>> + * that supports copy on read and has it enabled.  If copy on read is not
>>>> + * enabled, a block format driver may return NULL.
>>>> + */
>>>> +BlockDriverAIOCB *bdrv_aio_stream(BlockDriverState *bs, int64_t sector_num,
>>>> +                                  BlockDriverCompletionFunc *cb, void *opaque)
>>>
>>> I think bdrv_aio_stream is a bad name for this. It only becomes image
>>> streaming because the caller repeatedly calls this function. What the
>>> function really does is copying some data from the backing file into the
>>> overlay image.
>>
>> That's true but bdrv_aio_copy_from_backing_file() is a bit long.
>
> bdrv_copy_backing() or something should be short enough and still
> describes what it's really doing.
>
>> The
>> special thing about this operation is that it takes a starting
>> sector_num but no length.  The callback receives the nb_sectors.  So
>> this operation isn't an ordinary [start, length) copy either so
>> bdrv_aio_stream() isn't that bad?
>
> Well, you're going to introduce nb_sectors anyway, so it's not really
> special any more.

I guess you're right.  First I wasn't planning on passing nb_sectors
to this function since its the blockdev.c streaming loop that drives
the streaming process - we may not need nb_sectors here.  But I guess
this is like a read(2) function and the block driver can return short
reads if that is convenient due to cluster sizes or other image format
internals.  By passing in nb_sectors we avoid streaming too much at
the end.

Stefan

Patch

diff --git a/block.c b/block.c
index f731c7a..5e3476c 100644
--- a/block.c
+++ b/block.c
@@ -2248,6 +2248,38 @@  BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
     return ret;
 }
 
+/**
+ * Attempt to stream an image starting from sector_num.
+ *
+ * @sector_num - the first sector to start streaming from
+ * @cb - block completion callback
+ * @opaque - data to pass completion callback
+ *
+ * Returns NULL if the image format not support streaming, the image is
+ * read-only, or no image is open.
+ *
+ * The intention of this function is for a user to execute it once with a
+ * sector_num of 0 and then upon receiving a completion callback, to remember
+ * the number of sectors "streamed", and then to call this function again with
+ * an offset adjusted by the number of sectors previously streamed.
+ *
+ * This allows a user to progressive stream in an image at a pace that makes
+ * sense.  In general, this function tries to do the smallest amount of I/O
+ * possible to do some useful work.
+ *
+ * This function only really makes sense in combination with a block format
+ * that supports copy on read and has it enabled.  If copy on read is not
+ * enabled, a block format driver may return NULL.
+ */
+BlockDriverAIOCB *bdrv_aio_stream(BlockDriverState *bs, int64_t sector_num,
+                                  BlockDriverCompletionFunc *cb, void *opaque)
+{
+    if (!bs->drv || bs->read_only || !bs->drv->bdrv_aio_stream) {
+        return NULL;
+    }
+
+    return bs->drv->bdrv_aio_stream(bs, sector_num, cb, opaque);
+}
 
 typedef struct MultiwriteCB {
     int error;
diff --git a/block.h b/block.h
index 52e9cad..fad828a 100644
--- a/block.h
+++ b/block.h
@@ -119,6 +119,8 @@  BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
                                   BlockDriverCompletionFunc *cb, void *opaque);
 BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
 				 BlockDriverCompletionFunc *cb, void *opaque);
+BlockDriverAIOCB *bdrv_aio_stream(BlockDriverState *bs, int64_t sector_num,
+                                  BlockDriverCompletionFunc *cb, void *opaque);
 void bdrv_aio_cancel(BlockDriverAIOCB *acb);
 
 typedef struct BlockRequest {
diff --git a/block_int.h b/block_int.h
index 545ad11..0c125d0 100644
--- a/block_int.h
+++ b/block_int.h
@@ -73,6 +73,9 @@  struct BlockDriver {
         BlockDriverCompletionFunc *cb, void *opaque);
     BlockDriverAIOCB *(*bdrv_aio_flush)(BlockDriverState *bs,
         BlockDriverCompletionFunc *cb, void *opaque);
+    BlockDriverAIOCB *(*bdrv_aio_stream)(BlockDriverState *bs,
+        int64_t sector_num,
+        BlockDriverCompletionFunc *cb, void *opaque);
     int (*bdrv_discard)(BlockDriverState *bs, int64_t sector_num,
                         int nb_sectors);