Patchwork [1/7] raw-posix: add raw_get_aio_fd() for virtio-blk-data-plane

Submitter Stefan Hajnoczi
Date Nov. 15, 2012, 3:19 p.m.
Message ID <1352992746-8767-2-git-send-email-stefanha@redhat.com>
Permalink /patch/199319/
State New

Comments

Stefan Hajnoczi - Nov. 15, 2012, 3:19 p.m.
The raw_get_aio_fd() function allows virtio-blk-data-plane to get the
file descriptor of a raw image file with Linux AIO enabled.  This
interface is really a layering violation that can be resolved once the
block layer is able to run outside the global mutex - at that point
virtio-blk-data-plane will switch from custom Linux AIO code to using
the block layer.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block.h           |  9 +++++++++
 block/raw-posix.c | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)
Anthony Liguori - Nov. 15, 2012, 8:03 p.m.
Stefan Hajnoczi <stefanha@redhat.com> writes:

> The raw_get_aio_fd() function allows virtio-blk-data-plane to get the
> file descriptor of a raw image file with Linux AIO enabled.  This
> interface is really a layering violation that can be resolved once the
> block layer is able to run outside the global mutex - at that point
> virtio-blk-data-plane will switch from custom Linux AIO code to using
> the block layer.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

I think this creates user confusion because virtio-blk-data-plane can't
actually take a BDS.

So why not just make a string 'filename' property and open it directly
in virtio-blk-data-plane?  Then it's at least clear to the user and
management tools what the device is capable of doing.

Regards,

Anthony Liguori

> ---
>  block.h           |  9 +++++++++
>  block/raw-posix.c | 34 ++++++++++++++++++++++++++++++++++
>  2 files changed, 43 insertions(+)
>
> diff --git a/block.h b/block.h
> index 722c620..2dc6aaf 100644
> --- a/block.h
> +++ b/block.h
> @@ -365,6 +365,15 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
>  void bdrv_set_in_use(BlockDriverState *bs, int in_use);
>  int bdrv_in_use(BlockDriverState *bs);
>  
> +#ifdef CONFIG_LINUX_AIO
> +int raw_get_aio_fd(BlockDriverState *bs);
> +#else
> +static inline int raw_get_aio_fd(BlockDriverState *bs)
> +{
> +    return -ENOTSUP;
> +}
> +#endif
> +
>  enum BlockAcctType {
>      BDRV_ACCT_READ,
>      BDRV_ACCT_WRITE,
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index f2f0404..fc04981 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -1768,6 +1768,40 @@ static BlockDriver bdrv_host_cdrom = {
>  };
>  #endif /* __FreeBSD__ */
>  
> +#ifdef CONFIG_LINUX_AIO
> +/**
> + * Return the file descriptor for Linux AIO
> + *
> + * This function is a layering violation and should be removed when it becomes
> + * possible to call the block layer outside the global mutex.  It allows the
> + * caller to hijack the file descriptor so I/O can be performed outside the
> + * block layer.
> + */
> +int raw_get_aio_fd(BlockDriverState *bs)
> +{
> +    BDRVRawState *s;
> +
> +    if (!bs->drv) {
> +        return -ENOMEDIUM;
> +    }
> +
> +    if (bs->drv == bdrv_find_format("raw")) {
> +        bs = bs->file;
> +    }
> +
> +    /* raw-posix has several protocols so just check for raw_aio_readv */
> +    if (bs->drv->bdrv_aio_readv != raw_aio_readv) {
> +        return -ENOTSUP;
> +    }
> +
> +    s = bs->opaque;
> +    if (!s->use_aio) {
> +        return -ENOTSUP;
> +    }
> +    return s->fd;
> +}
> +#endif /* CONFIG_LINUX_AIO */
> +
>  static void bdrv_file_init(void)
>  {
>      /*
> -- 
> 1.8.0
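
For comparison, the 'filename' alternative suggested above could look roughly like the sketch below. It is not part of the patch: open_image_directly() is a hypothetical helper, and the choice of O_RDWR plus O_DIRECT is an assumption (Linux AIO is only reliably asynchronous on O_DIRECT descriptors). Note that nothing in it verifies the image format or any block-layer AIO setting.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical helper, not part of the patch: open the image named by a
 * 'filename' property and return the file descriptor, or a negative
 * errno value on failure (the same convention raw_get_aio_fd() uses).
 */
static int open_image_directly(const char *filename)
{
    int flags = O_RDWR;
    int fd;

#ifdef O_DIRECT
    /* Assumption: request O_DIRECT where the platform headers expose it. */
    flags |= O_DIRECT;
#endif

    fd = open(filename, flags);
    if (fd < 0) {
        return -errno;
    }
    return fd;
}
```

A caller would treat any negative return as an error code and would own the descriptor (and have to close it) on success.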
Stefan Hajnoczi - Nov. 16, 2012, 6:15 a.m.
On Thu, Nov 15, 2012 at 9:03 PM, Anthony Liguori <aliguori@us.ibm.com> wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
>
>> The raw_get_aio_fd() function allows virtio-blk-data-plane to get the
>> file descriptor of a raw image file with Linux AIO enabled.  This
>> interface is really a layering violation that can be resolved once the
>> block layer is able to run outside the global mutex - at that point
>> virtio-blk-data-plane will switch from custom Linux AIO code to using
>> the block layer.
>>
>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>
> I think this creates user confusion because virtio-blk-data-plane can't
> actually take a BDS.
>
> So why not just make a string 'filename' property and open it directly
> in virtio-blk-data-plane?  Then it's at least clear to the user and
> management tools what the device is capable of doing.

There are some benefits to raw_get_aio_fd():

1. virtio-blk-data-plane implements only a subset of virtio-blk; it
still needs a regular virtio-blk-pci device (with BDS) in order to
run.  If we use a filename the user would have to specify it twice.

2. Fetching the file descriptor in this way ensures that the image
file is format=raw.

3. virtio-blk-data-plane uses Linux AIO and raw-posix.c has checks
which I don't want to duplicate - we can simply check s->use_aio in
raw_get_aio_fd() to confirm that Linux AIO can be used.

If we open a file directly then we lose these benefits.  Do you still
think we should open a filename?

Stefan
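
The order of checks behind points 2 and 3 can be modelled in isolation. This is only a sketch under stated assumptions: Node is a hypothetical, heavily simplified stand-in for BlockDriverState and BDRVRawState, the driver is identified by a name string (the patch compares the bdrv_aio_readv function pointer instead), and the NULL check on the underlying file node is an addition that the patch does not make.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for BlockDriverState plus BDRVRawState. */
typedef struct Node Node;
struct Node {
    const char *driver; /* NULL models bs->drv == NULL (no medium) */
    Node *file;         /* protocol node underneath a format node */
    bool use_aio;       /* models BDRVRawState.use_aio */
    int fd;             /* models BDRVRawState.fd */
};

/* Mirrors the decision chain of raw_get_aio_fd() on the model types. */
static int model_get_aio_fd(const Node *bs)
{
    assert(bs != NULL);

    if (bs->driver == NULL) {
        return -ENOMEDIUM;
    }

    /* format=raw is a thin layer: step down to the protocol node. */
    if (strcmp(bs->driver, "raw") == 0) {
        bs = bs->file;
        /* Extra defensive check, not present in the patch. */
        if (bs == NULL || bs->driver == NULL) {
            return -ENOMEDIUM;
        }
    }

    /* Anything that is not the raw-posix "file" protocol is rejected. */
    if (strcmp(bs->driver, "file") != 0) {
        return -ENOTSUP;
    }

    /* Linux AIO must have been enabled when the image was opened. */
    if (!bs->use_aio) {
        return -ENOTSUP;
    }
    return bs->fd;
}
```

In this model an image format such as qcow2 stacked on the file node is rejected with -ENOTSUP, which is the format=raw guarantee described in point 2.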
Paolo Bonzini - Nov. 16, 2012, 8:22 a.m.
On 16/11/2012 07:15, Stefan Hajnoczi wrote:
>> >
>> > So why not just make a string 'filename' property and open it directly
>> > in virtio-blk-data-plane?  Then it's at least clear to the user and
>> > management tools what the device is capable of doing.
> There are some benefits to raw_get_aio_fd():
> 
> 1. virtio-blk-data-plane implements only a subset of virtio-blk; it
> still needs a regular virtio-blk-pci device (with BDS) in order to
> run.  If we use a filename the user would have to specify it twice.
> 
> 2. Fetching the file descriptor in this way ensures that the image
> file is format=raw.
> 
> 3. virtio-blk-data-plane uses Linux AIO and raw-posix.c has checks
> which I don't want to duplicate - we can simply check s->use_aio in
> raw_get_aio_fd() to confirm that Linux AIO can be used.

Agreed.  This is not vhost-blk, for which I agree that opening the file
would make more sense (so you have no BDS at all).  It's just a stopgap
measure for something that should become the standard implementation.

Paolo

Patch

diff --git a/block.h b/block.h
index 722c620..2dc6aaf 100644
--- a/block.h
+++ b/block.h
@@ -365,6 +365,15 @@  void bdrv_disable_copy_on_read(BlockDriverState *bs);
 void bdrv_set_in_use(BlockDriverState *bs, int in_use);
 int bdrv_in_use(BlockDriverState *bs);
 
+#ifdef CONFIG_LINUX_AIO
+int raw_get_aio_fd(BlockDriverState *bs);
+#else
+static inline int raw_get_aio_fd(BlockDriverState *bs)
+{
+    return -ENOTSUP;
+}
+#endif
+
 enum BlockAcctType {
     BDRV_ACCT_READ,
     BDRV_ACCT_WRITE,
diff --git a/block/raw-posix.c b/block/raw-posix.c
index f2f0404..fc04981 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1768,6 +1768,40 @@  static BlockDriver bdrv_host_cdrom = {
 };
 #endif /* __FreeBSD__ */
 
+#ifdef CONFIG_LINUX_AIO
+/**
+ * Return the file descriptor for Linux AIO
+ *
+ * This function is a layering violation and should be removed when it becomes
+ * possible to call the block layer outside the global mutex.  It allows the
+ * caller to hijack the file descriptor so I/O can be performed outside the
+ * block layer.
+ */
+int raw_get_aio_fd(BlockDriverState *bs)
+{
+    BDRVRawState *s;
+
+    if (!bs->drv) {
+        return -ENOMEDIUM;
+    }
+
+    if (bs->drv == bdrv_find_format("raw")) {
+        bs = bs->file;
+    }
+
+    /* raw-posix has several protocols so just check for raw_aio_readv */
+    if (bs->drv->bdrv_aio_readv != raw_aio_readv) {
+        return -ENOTSUP;
+    }
+
+    s = bs->opaque;
+    if (!s->use_aio) {
+        return -ENOTSUP;
+    }
+    return s->fd;
+}
+#endif /* CONFIG_LINUX_AIO */
+
 static void bdrv_file_init(void)
 {
     /*
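
Since raw_get_aio_fd() returns either a file descriptor or a negative errno value, a caller has to separate the two cases before using the result. The following is a minimal sketch of that handling, not code from this series: check_aio_fd() and its message text are hypothetical, and the value passed in stands for whatever raw_get_aio_fd() returned.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical caller-side helper: interpret a raw_get_aio_fd() result.
 * On success, store the descriptor in *fd_out and return 0. On failure,
 * leave *fd_out untouched, write a readable reason and return -1.
 */
static int check_aio_fd(int ret, int *fd_out, char *reason, size_t reason_len)
{
    if (ret < 0) {
        /* Negative values are -errno, e.g. -ENOTSUP or -ENOMEDIUM. */
        snprintf(reason, reason_len, "cannot use Linux AIO fd: %s",
                 strerror(-ret));
        return -1;
    }
    *fd_out = ret;
    return 0;
}
```

The descriptor remains owned by the block layer in the patch, so a caller following this pattern would use it for I/O but would not close it.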