Patchwork [v6,12/12] virtio-blk: add x-data-plane=on|off performance feature

login
register
mail settings
Submitter Stefan Hajnoczi
Date Dec. 10, 2012, 1:09 p.m.
Message ID <1355144985-12897-13-git-send-email-stefanha@redhat.com>
Download mbox | patch
Permalink /patch/204875/
State New
Headers show

Comments

Stefan Hajnoczi - Dec. 10, 2012, 1:09 p.m.
The virtio-blk-data-plane feature is easy to integrate into
hw/virtio-blk.c.  The data plane can be started and stopped similar to
vhost-net.

Users can take advantage of the virtio-blk-data-plane feature using the
new -device virtio-blk-pci,x-data-plane=on property.

The x-data-plane name was chosen because at this stage the feature is
experimental and likely to see changes in the future.

If the VM configuration does not support virtio-blk-data-plane an error
message is printed.  Although we could fall back to regular virtio-blk,
I prefer the explicit approach since it prompts the user to fix their
configuration if they want the performance benefit of
virtio-blk-data-plane.

Limitations:
 * Only format=raw is supported
 * Live migration is not supported
 * Block jobs, hot unplug, and other operations fail with -EBUSY
 * I/O throttling limits are ignored
 * Only Linux hosts are supported due to Linux AIO usage

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/virtio-blk.c | 28 +++++++++++++++++++++++++++-
 hw/virtio-pci.c |  3 +++
 2 files changed, 30 insertions(+), 1 deletion(-)
Michael S. Tsirkin - Dec. 16, 2012, 4:08 p.m.
On Mon, Dec 10, 2012 at 02:09:45PM +0100, Stefan Hajnoczi wrote:
> @@ -33,6 +34,7 @@ typedef struct VirtIOBlock
>      VirtIOBlkConf *blk;
>      unsigned short sector_mask;
>      DeviceState *qdev;
> +    VirtIOBlockDataPlane *dataplane;
>  } VirtIOBlock;
>  
>  static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
> @@ -407,6 +409,14 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>          .num_writes = 0,
>      };
>  
> +    /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
> +     * dataplane here instead of waiting for .set_status().
> +     */

By the way which guests are these?

> +    if (s->dataplane) {
> +        virtio_blk_data_plane_start(s->dataplane);
> +        return;
> +    }
> +

By the way it's chunk such as this that I meant: it's not
compiled out even if dataplane is disabled by configure.
Naither is the extra field in the struct.


>      while ((req = virtio_blk_get_request(s))) {
>          virtio_blk_handle_request(req, &mrb);
>      }
Stefan Hajnoczi - Dec. 18, 2012, 2:57 p.m.
On Sun, Dec 16, 2012 at 06:08:53PM +0200, Michael S. Tsirkin wrote:
> On Mon, Dec 10, 2012 at 02:09:45PM +0100, Stefan Hajnoczi wrote:
> > @@ -33,6 +34,7 @@ typedef struct VirtIOBlock
> >      VirtIOBlkConf *blk;
> >      unsigned short sector_mask;
> >      DeviceState *qdev;
> > +    VirtIOBlockDataPlane *dataplane;
> >  } VirtIOBlock;
> >  
> >  static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
> > @@ -407,6 +409,14 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> >          .num_writes = 0,
> >      };
> >  
> > +    /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
> > +     * dataplane here instead of waiting for .set_status().
> > +     */
> 
> By the way which guests are these?

I ran a Windows 8 guest today with build 48 virtio-win drivers.  It
notifies before the device gets its .set_status() callback invoked.

But I could swear I've seen Linux guests do this too.

> > +    if (s->dataplane) {
> > +        virtio_blk_data_plane_start(s->dataplane);
> > +        return;
> > +    }
> > +
> 
> By the way it's chunk such as this that I meant: it's not
> compiled out even if dataplane is disabled by configure.
> Naither is the extra field in the struct.

Okay.

Stefan
Michael S. Tsirkin - Dec. 18, 2012, 3:22 p.m.
On Tue, Dec 18, 2012 at 03:57:17PM +0100, Stefan Hajnoczi wrote:
> > > @@ -407,6 +409,14 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> > >          .num_writes = 0,
> > >      };
> > >  
> > > +    /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
> > > +     * dataplane here instead of waiting for .set_status().
> > > +     */
> > 
> > By the way which guests are these?
> 
> I ran a Windows 8 guest today with build 48 virtio-win drivers.  It
> notifies before the device gets its .set_status() callback invoked.
> But I could swear I've seen Linux guests do this too.


That's very broken. But looking at linux drivers it also
seems linux guests do this even today.
We have:

       err = drv->probe(dev);
        if (err)
                add_status(dev, VIRTIO_CONFIG_S_FAILED);
        else {
                add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
                if (drv->scan)
                        drv->scan(dev);
        }

this means that unless drivers implement scan() they
will make device active before DRIVER_OK is written
as the result linux can access it and kick.
And almost no drivers implement scan.
Nasty.

Rusty, what do you think? Worth fixing?

It does mean that for now we are stuck
with a work around, but I think we need it
in virtio core in qemu, it's not dataplane
specific.
Rusty Russell - Dec. 20, 2012, 4:04 a.m.
"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Tue, Dec 18, 2012 at 03:57:17PM +0100, Stefan Hajnoczi wrote:
>> > > @@ -407,6 +409,14 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>> > >          .num_writes = 0,
>> > >      };
>> > >  
>> > > +    /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
>> > > +     * dataplane here instead of waiting for .set_status().
>> > > +     */
>> > 
>> > By the way which guests are these?
>> 
>> I ran a Windows 8 guest today with build 48 virtio-win drivers.  It
>> notifies before the device gets its .set_status() callback invoked.
>> But I could swear I've seen Linux guests do this too.
>
>
> That's very broken. But looking at linux drivers it also
> seems linux guests do this even today.
> We have:
>
>        err = drv->probe(dev);
>         if (err)
>                 add_status(dev, VIRTIO_CONFIG_S_FAILED);
>         else {
>                 add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
>                 if (drv->scan)
>                         drv->scan(dev);
>         }
>
> this means that unless drivers implement scan() they
> will make device active before DRIVER_OK is written
> as the result linux can access it and kick.
> And almost no drivers implement scan.
> Nasty.

Yes, that's true.  But as long as they have completed feature
negotiation, we allow this (that's why we tool feature negotiation out
of the drivers).

For example, filling an input virtqueue may well mean we kick the vq,
and almost every device does this.

virtio_block is the worst: add_disk() does partition scanning.

> Rusty, what do you think? Worth fixing?

If we tried, I'm fairly sure things would slip through.  My feeling has
been that we should not rely on the status to indicate readiness.

> It does mean that for now we are stuck
> with a work around, but I think we need it
> in virtio core in qemu, it's not dataplane
> specific.

Yes, it's a general problem.

Cheers,
Rusty.

Patch

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index fabf387..4e7ef64 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -17,6 +17,7 @@ 
 #include "hw/block-common.h"
 #include "blockdev.h"
 #include "virtio-blk.h"
+#include "hw/dataplane/virtio-blk.h"
 #include "scsi-defs.h"
 #ifdef __linux__
 # include <scsi/sg.h>
@@ -33,6 +34,7 @@  typedef struct VirtIOBlock
     VirtIOBlkConf *blk;
     unsigned short sector_mask;
     DeviceState *qdev;
+    VirtIOBlockDataPlane *dataplane;
 } VirtIOBlock;
 
 static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
@@ -407,6 +409,14 @@  static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
         .num_writes = 0,
     };
 
+    /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
+     * dataplane here instead of waiting for .set_status().
+     */
+    if (s->dataplane) {
+        virtio_blk_data_plane_start(s->dataplane);
+        return;
+    }
+
     while ((req = virtio_blk_get_request(s))) {
         virtio_blk_handle_request(req, &mrb);
     }
@@ -446,8 +456,13 @@  static void virtio_blk_dma_restart_cb(void *opaque, int running,
 {
     VirtIOBlock *s = opaque;
 
-    if (!running)
+    if (!running) {
+        /* qemu_drain_all() doesn't know about data plane, quiesce here */
+        if (s->dataplane) {
+            virtio_blk_data_plane_drain(s->dataplane);
+        }
         return;
+    }
 
     if (!s->bh) {
         s->bh = qemu_bh_new(virtio_blk_dma_restart_bh, s);
@@ -541,6 +556,10 @@  static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
     VirtIOBlock *s = to_virtio_blk(vdev);
     uint32_t features;
 
+    if (s->dataplane && !(status & VIRTIO_CONFIG_S_DRIVER)) {
+        virtio_blk_data_plane_stop(s->dataplane);
+    }
+
     if (!(status & VIRTIO_CONFIG_S_DRIVER_OK)) {
         return;
     }
@@ -638,6 +657,10 @@  VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk)
     s->sector_mask = (s->conf->logical_block_size / BDRV_SECTOR_SIZE) - 1;
 
     s->vq = virtio_add_queue(&s->vdev, 128, virtio_blk_handle_output);
+    if (!virtio_blk_data_plane_create(&s->vdev, blk, &s->dataplane)) {
+        virtio_cleanup(&s->vdev);
+        return NULL;
+    }
 
     qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s);
     s->qdev = dev;
@@ -655,6 +678,9 @@  VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk)
 void virtio_blk_exit(VirtIODevice *vdev)
 {
     VirtIOBlock *s = to_virtio_blk(vdev);
+
+    virtio_blk_data_plane_destroy(s->dataplane);
+    s->dataplane = NULL;
     unregister_savevm(s->qdev, "virtio-blk", s);
     blockdev_mark_auto_del(s->bs);
     virtio_cleanup(vdev);
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 7684ac9..c60b89a 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -896,6 +896,9 @@  static Property virtio_blk_properties[] = {
     DEFINE_PROP_BIT("scsi", VirtIOPCIProxy, blk.scsi, 0, true),
 #endif
     DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
+#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
+    DEFINE_PROP_BIT("x-data-plane", VirtIOPCIProxy, blk.data_plane, 0, false),
+#endif
     DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
     DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
     DEFINE_PROP_END_OF_LIST(),