Message ID | 1355144985-12897-13-git-send-email-stefanha@redhat.com |
---|---|
State | New |
Headers | show |
On Mon, Dec 10, 2012 at 02:09:45PM +0100, Stefan Hajnoczi wrote: > @@ -33,6 +34,7 @@ typedef struct VirtIOBlock > VirtIOBlkConf *blk; > unsigned short sector_mask; > DeviceState *qdev; > + VirtIOBlockDataPlane *dataplane; > } VirtIOBlock; > > static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev) > @@ -407,6 +409,14 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq) > .num_writes = 0, > }; > > + /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start > + * dataplane here instead of waiting for .set_status(). > + */ By the way which guests are these? > + if (s->dataplane) { > + virtio_blk_data_plane_start(s->dataplane); > + return; > + } > + By the way it's chunk such as this that I meant: it's not compiled out even if dataplane is disabled by configure. Naither is the extra field in the struct. > while ((req = virtio_blk_get_request(s))) { > virtio_blk_handle_request(req, &mrb); > }
On Sun, Dec 16, 2012 at 06:08:53PM +0200, Michael S. Tsirkin wrote: > On Mon, Dec 10, 2012 at 02:09:45PM +0100, Stefan Hajnoczi wrote: > > @@ -33,6 +34,7 @@ typedef struct VirtIOBlock > > VirtIOBlkConf *blk; > > unsigned short sector_mask; > > DeviceState *qdev; > > + VirtIOBlockDataPlane *dataplane; > > } VirtIOBlock; > > > > static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev) > > @@ -407,6 +409,14 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq) > > .num_writes = 0, > > }; > > > > + /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start > > + * dataplane here instead of waiting for .set_status(). > > + */ > > By the way which guests are these? I ran a Windows 8 guest today with build 48 virtio-win drivers. It notifies before the device gets its .set_status() callback invoked. But I could swear I've seen Linux guests do this too. > > + if (s->dataplane) { > > + virtio_blk_data_plane_start(s->dataplane); > > + return; > > + } > > + > > By the way it's chunk such as this that I meant: it's not > compiled out even if dataplane is disabled by configure. > Naither is the extra field in the struct. Okay. Stefan
On Tue, Dec 18, 2012 at 03:57:17PM +0100, Stefan Hajnoczi wrote: > > > @@ -407,6 +409,14 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq) > > > .num_writes = 0, > > > }; > > > > > > + /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start > > > + * dataplane here instead of waiting for .set_status(). > > > + */ > > > > By the way which guests are these? > > I ran a Windows 8 guest today with build 48 virtio-win drivers. It > notifies before the device gets its .set_status() callback invoked. > But I could swear I've seen Linux guests do this too. That's very broken. But looking at linux drivers it also seems linux guests do this even today. We have: err = drv->probe(dev); if (err) add_status(dev, VIRTIO_CONFIG_S_FAILED); else { add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK); if (drv->scan) drv->scan(dev); } this means that unless drivers implement scan() they will make device active before DRIVER_OK is written as the result linux can access it and kick. And almost no drivers implement scan. Nasty. Rusty, what do you think? Worth fixing? It does mean that for now we are stuck with a work around, but I think we need it in virtio core in qemu, it's not dataplane specific.
"Michael S. Tsirkin" <mst@redhat.com> writes: > On Tue, Dec 18, 2012 at 03:57:17PM +0100, Stefan Hajnoczi wrote: >> > > @@ -407,6 +409,14 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq) >> > > .num_writes = 0, >> > > }; >> > > >> > > + /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start >> > > + * dataplane here instead of waiting for .set_status(). >> > > + */ >> > >> > By the way which guests are these? >> >> I ran a Windows 8 guest today with build 48 virtio-win drivers. It >> notifies before the device gets its .set_status() callback invoked. >> But I could swear I've seen Linux guests do this too. > > > That's very broken. But looking at linux drivers it also > seems linux guests do this even today. > We have: > > err = drv->probe(dev); > if (err) > add_status(dev, VIRTIO_CONFIG_S_FAILED); > else { > add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK); > if (drv->scan) > drv->scan(dev); > } > > this means that unless drivers implement scan() they > will make device active before DRIVER_OK is written > as the result linux can access it and kick. > And almost no drivers implement scan. > Nasty. Yes, that's true. But as long as they have completed feature negotiation, we allow this (that's why we tool feature negotiation out of the drivers). For example, filling an input virtqueue may well mean we kick the vq, and almost every device does this. virtio_block is the worst: add_disk() does partition scanning. > Rusty, what do you think? Worth fixing? If we tried, I'm fairly sure things would slip through. My feeling has been that we should not rely on the status to indicate readiness. > It does mean that for now we are stuck > with a work around, but I think we need it > in virtio core in qemu, it's not dataplane > specific. Yes, it's a general problem. Cheers, Rusty.
diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c index fabf387..4e7ef64 100644 --- a/hw/virtio-blk.c +++ b/hw/virtio-blk.c @@ -17,6 +17,7 @@ #include "hw/block-common.h" #include "blockdev.h" #include "virtio-blk.h" +#include "hw/dataplane/virtio-blk.h" #include "scsi-defs.h" #ifdef __linux__ # include <scsi/sg.h> @@ -33,6 +34,7 @@ typedef struct VirtIOBlock VirtIOBlkConf *blk; unsigned short sector_mask; DeviceState *qdev; + VirtIOBlockDataPlane *dataplane; } VirtIOBlock; static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev) @@ -407,6 +409,14 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq) .num_writes = 0, }; + /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start + * dataplane here instead of waiting for .set_status(). + */ + if (s->dataplane) { + virtio_blk_data_plane_start(s->dataplane); + return; + } + while ((req = virtio_blk_get_request(s))) { virtio_blk_handle_request(req, &mrb); } @@ -446,8 +456,13 @@ static void virtio_blk_dma_restart_cb(void *opaque, int running, { VirtIOBlock *s = opaque; - if (!running) + if (!running) { + /* qemu_drain_all() doesn't know about data plane, quiesce here */ + if (s->dataplane) { + virtio_blk_data_plane_drain(s->dataplane); + } return; + } if (!s->bh) { s->bh = qemu_bh_new(virtio_blk_dma_restart_bh, s); @@ -541,6 +556,10 @@ static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status) VirtIOBlock *s = to_virtio_blk(vdev); uint32_t features; + if (s->dataplane && !(status & VIRTIO_CONFIG_S_DRIVER)) { + virtio_blk_data_plane_stop(s->dataplane); + } + if (!(status & VIRTIO_CONFIG_S_DRIVER_OK)) { return; } @@ -638,6 +657,10 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk) s->sector_mask = (s->conf->logical_block_size / BDRV_SECTOR_SIZE) - 1; s->vq = virtio_add_queue(&s->vdev, 128, virtio_blk_handle_output); + if (!virtio_blk_data_plane_create(&s->vdev, blk, &s->dataplane)) { + virtio_cleanup(&s->vdev); + return NULL; + } qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s); s->qdev = dev; @@ -655,6 +678,9 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk) void virtio_blk_exit(VirtIODevice *vdev) { VirtIOBlock *s = to_virtio_blk(vdev); + + virtio_blk_data_plane_destroy(s->dataplane); + s->dataplane = NULL; unregister_savevm(s->qdev, "virtio-blk", s); blockdev_mark_auto_del(s->bs); virtio_cleanup(vdev); diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c index 7684ac9..c60b89a 100644 --- a/hw/virtio-pci.c +++ b/hw/virtio-pci.c @@ -896,6 +896,9 @@ static Property virtio_blk_properties[] = { DEFINE_PROP_BIT("scsi", VirtIOPCIProxy, blk.scsi, 0, true), #endif DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true), +#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE + DEFINE_PROP_BIT("x-data-plane", VirtIOPCIProxy, blk.data_plane, 0, false), +#endif DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2), DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features), DEFINE_PROP_END_OF_LIST(),
The virtio-blk-data-plane feature is easy to integrate into hw/virtio-blk.c. The data plane can be started and stopped similar to vhost-net. Users can take advantage of the virtio-blk-data-plane feature using the new -device virtio-blk-pci,x-data-plane=on property. The x-data-plane name was chosen because at this stage the feature is experimental and likely to see changes in the future. If the VM configuration does not support virtio-blk-data-plane an error message is printed. Although we could fall back to regular virtio-blk, I prefer the explicit approach since it prompts the user to fix their configuration if they want the performance benefit of virtio-blk-data-plane. Limitations: * Only format=raw is supported * Live migration is not supported * Block jobs, hot unplug, and other operations fail with -EBUSY * I/O throttling limits are ignored * Only Linux hosts are supported due to Linux AIO usage Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> --- hw/virtio-blk.c | 28 +++++++++++++++++++++++++++- hw/virtio-pci.c | 3 +++ 2 files changed, 30 insertions(+), 1 deletion(-)