Patchwork [3/3] vhost: roll our own cpu map variant

Submitter Michael S. Tsirkin
Date March 28, 2011, 9:14 p.m.
Message ID <a9bae93a939257ae0c01d136aaebd9d488cff071.1301346785.git.mst@redhat.com>
Permalink /patch/88691/
State New

Comments

Michael S. Tsirkin - March 28, 2011, 9:14 p.m.
vhost used cpu_physical_memory_map to get the
virtual address for the ring; however,
that function exits QEMU on an illegal RAM address.
Since the addresses are guest-controlled, we
shouldn't do that.

Switch to our own variant that uses the vhost
tables and returns an error instead of exiting.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/vhost.c |   66 +++++++++++++++++++++++++++++++++++++++++++++++------------
 1 files changed, 52 insertions(+), 14 deletions(-)
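The overflow guard the new variant performs before walking the region table can be sketched standalone; `clamp_to_address_space` is an illustrative name, not a function from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the wrap-around guard in the patch:
 * if addr + size wraps past the end of the 64-bit address space, clamp
 * size so the range ends at the last representable byte.  In unsigned
 * arithmetic, -addr equals 2^64 - addr, i.e. the bytes remaining from
 * addr up to and including UINT64_MAX. */
static uint64_t clamp_to_address_space(uint64_t addr, uint64_t size)
{
    if (addr + size < addr) {   /* unsigned overflow detected */
        size = -addr;           /* bytes left before the address space ends */
    }
    return size;
}
```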
Stefan Hajnoczi - March 29, 2011, 10:53 a.m.
On Mon, Mar 28, 2011 at 10:14 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> vhost used cpu_physical_memory_map to get the
> virtual address for the ring, however,
> this will exit on an illegal RAM address.
> Since the addresses are guest-controlled, we
> shouldn't do that.
>
> Switch to our own variant that uses the vhost
> tables and returns an error instead of exiting.

We should make all of QEMU more robust instead of just vhost.  Perhaps
introduce cpu_physical_memory_map_nofail(...) that aborts like the
current cpu_physical_memory_map() implementation and then make non-hw/
users call that one.  hw/ users should check for failure.

Stefan
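Stefan's proposed split could look roughly like the sketch below. `cpu_physical_memory_map_nofail` is only his suggested name, and the `map` function here is a stand-in stub over a fake RAM array, not QEMU's real implementation:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define FAKE_RAM_SIZE 0x1000
static uint8_t fake_ram[FAKE_RAM_SIZE];

/* Stand-in for a failure-returning cpu_physical_memory_map():
 * returns NULL for addresses outside RAM instead of exiting. */
static void *map(uint64_t addr, uint64_t *plen, int is_write)
{
    (void)is_write;
    if (addr >= FAKE_RAM_SIZE) {
        return NULL;
    }
    if (addr + *plen > FAKE_RAM_SIZE) {
        *plen = FAKE_RAM_SIZE - addr;   /* truncate to end of RAM */
    }
    return fake_ram + addr;
}

/* The _nofail variant Stefan proposes: aborts on failure, so callers
 * outside hw/ keep today's behavior without checking the result. */
static void *map_nofail(uint64_t addr, uint64_t *plen, int is_write)
{
    void *p = map(addr, plen, is_write);
    if (!p) {
        abort();
    }
    return p;
}
```

hw/ code would call `map` and handle NULL; everything else would call `map_nofail` and keep the current abort-on-bad-address semantics.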
Michael S. Tsirkin - March 30, 2011, 4:09 p.m.
On Tue, Mar 29, 2011 at 11:53:54AM +0100, Stefan Hajnoczi wrote:
> On Mon, Mar 28, 2011 at 10:14 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > vhost used cpu_physical_memory_map to get the
> > virtual address for the ring, however,
> > this will exit on an illegal RAM address.
> > Since the addresses are guest-controlled, we
> > shouldn't do that.
> >
> > Switch to our own variant that uses the vhost
> > tables and returns an error instead of exiting.
> 
> We should make all of QEMU more robust instead of just vhost.  Perhaps
> introduce cpu_physical_memory_map_nofail(...) that aborts like the
> current cpu_physical_memory_map() implementation and then make non-hw/
> users call that one.  hw/ users should check for failure.
> 
> Stefan

Yea, well ... at least vhost-net also wants to check that
it is given a RAM address, not some other physical address.
We could generally replace the memory management in vhost-net
with some other logic; once that's done, this one can
go away as well.
Stefan Hajnoczi - March 30, 2011, 4:26 p.m.
On Wed, Mar 30, 2011 at 5:09 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Tue, Mar 29, 2011 at 11:53:54AM +0100, Stefan Hajnoczi wrote:
>> On Mon, Mar 28, 2011 at 10:14 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > vhost used cpu_physical_memory_map to get the
>> > virtual address for the ring, however,
>> > this will exit on an illegal RAM address.
>> > Since the addresses are guest-controlled, we
>> > shouldn't do that.
>> >
>> > Switch to our own variant that uses the vhost
>> > tables and returns an error instead of exiting.
>>
>> We should make all of QEMU more robust instead of just vhost.  Perhaps
>> introduce cpu_physical_memory_map_nofail(...) that aborts like the
>> current cpu_physical_memory_map() implementation and then make non-hw/
>> users call that one.  hw/ users should check for failure.
>>
>> Stefan
>
> Yea, well ... at least vhost-net wants to also check
> it is given a ram address, not some other physical address.
> We could generally replace the memory management in vhost-net
> by some other logic, when that's done this one can
> go away as well.

Sounds like you do not want to refactor physical memory access for
non-vhost.  Fair enough but we have to do it sooner or later in order
to make all of QEMU more robust.  If vhost-net is protected but the
IDE CD-ROM and virtio-blk disk still have issues then we haven't
reached our goal yet.  Any way I can convince you to do a generic API?
:)

Stefan
Michael S. Tsirkin - March 30, 2011, 4:59 p.m.
On Wed, Mar 30, 2011 at 05:26:22PM +0100, Stefan Hajnoczi wrote:
> On Wed, Mar 30, 2011 at 5:09 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Tue, Mar 29, 2011 at 11:53:54AM +0100, Stefan Hajnoczi wrote:
> >> On Mon, Mar 28, 2011 at 10:14 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> > vhost used cpu_physical_memory_map to get the
> >> > virtual address for the ring, however,
> >> > this will exit on an illegal RAM address.
> >> > Since the addresses are guest-controlled, we
> >> > shouldn't do that.
> >> >
> >> > Switch to our own variant that uses the vhost
> >> > tables and returns an error instead of exiting.
> >>
> >> We should make all of QEMU more robust instead of just vhost.  Perhaps
> >> introduce cpu_physical_memory_map_nofail(...) that aborts like the
> >> current cpu_physical_memory_map() implementation and then make non-hw/
> >> users call that one.  hw/ users should check for failure.
> >>
> >> Stefan
> >
> > Yea, well ... at least vhost-net wants to also check
> > it is given a ram address, not some other physical address.
> > We could generally replace the memory management in vhost-net
> > by some other logic, when that's done this one can
> > go away as well.
> 
> Sounds like you do not want to refactor physical memory access for
> non-vhost.  Fair enough but we have to do it sooner or later in order
> to make all of QEMU more robust.  If vhost-net is protected but the
> IDE CD-ROM and virtio-blk disk still have issues then we haven't
> reached our goal yet.  Any way I can convince you to do a generic API?
> :)
> 
> Stefan

If you are talking about splitting real ram from non ram
and creating a generic API for that, you don't need to convince me,
but I can't commit to implementing it right now.
Stefan Hajnoczi - March 30, 2011, 5:59 p.m.
On Wed, Mar 30, 2011 at 5:59 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Wed, Mar 30, 2011 at 05:26:22PM +0100, Stefan Hajnoczi wrote:
>> On Wed, Mar 30, 2011 at 5:09 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > On Tue, Mar 29, 2011 at 11:53:54AM +0100, Stefan Hajnoczi wrote:
>> >> On Mon, Mar 28, 2011 at 10:14 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> >> > vhost used cpu_physical_memory_map to get the
>> >> > virtual address for the ring, however,
>> >> > this will exit on an illegal RAM address.
>> >> > Since the addresses are guest-controlled, we
>> >> > shouldn't do that.
>> >> >
>> >> > Switch to our own variant that uses the vhost
>> >> > tables and returns an error instead of exiting.
>> >>
>> >> We should make all of QEMU more robust instead of just vhost.  Perhaps
>> >> introduce cpu_physical_memory_map_nofail(...) that aborts like the
>> >> current cpu_physical_memory_map() implementation and then make non-hw/
>> >> users call that one.  hw/ users should check for failure.
>> >>
>> >> Stefan
>> >
>> > Yea, well ... at least vhost-net wants to also check
>> > it is given a ram address, not some other physical address.
>> > We could generally replace the memory management in vhost-net
>> > by some other logic, when that's done this one can
>> > go away as well.
>>
>> Sounds like you do not want to refactor physical memory access for
>> non-vhost.  Fair enough but we have to do it sooner or later in order
>> to make all of QEMU more robust.  If vhost-net is protected but the
>> IDE CD-ROM and virtio-blk disk still have issues then we haven't
>> reached our goal yet.  Any way I can convince you to do a generic API?
>> :)
>>
>> Stefan
>
> If you are talking about splitting real ram from non ram
> and creating a generic API for that, you don't need to convince me,
> but I can't commit to implementing it right now.

Okay, userspace virtio will be able to use it as well in the future.

Stefan

Patch

diff --git a/hw/vhost.c b/hw/vhost.c
index c17a831..5fd09b5 100644
--- a/hw/vhost.c
+++ b/hw/vhost.c
@@ -271,6 +271,44 @@  static inline void vhost_dev_log_resize(struct vhost_dev* dev, uint64_t size)
     dev->log_size = size;
 }
 
+/* Same as cpu_physical_memory_map but doesn't allocate,
+ * doesn't use a bounce buffer, checks input for errors such
+ * as wrap-around, and does not exit on failure. */
+static void *vhost_memory_map(struct vhost_dev *dev,
+                              uint64_t addr,
+                              uint64_t *size,
+                              int is_write)
+{
+    int i;
+    if (addr + *size < addr) {
+        *size = -addr;
+    }
+    for (i = 0; i < dev->mem->nregions; ++i) {
+        struct vhost_memory_region *reg = dev->mem->regions + i;
+        uint64_t rlast, mlast, userspace_addr;
+        if (!range_covers_byte(reg->guest_phys_addr, reg->memory_size, addr)) {
+            continue;
+        }
+        rlast = range_get_last(reg->guest_phys_addr, reg->memory_size);
+        mlast = range_get_last(addr, *size);
+        if (rlast < mlast) {
+            *size -= (mlast - rlast);
+        }
+        userspace_addr = reg->userspace_addr + addr - reg->guest_phys_addr;
+        if ((unsigned long)userspace_addr != userspace_addr) {
+            return NULL;
+        }
+        return (void *)((unsigned long)userspace_addr);
+    }
+    return NULL;
+}
+
+/* Placeholder to keep the API consistent with cpu_physical_memory_unmap. */
+static void vhost_memory_unmap(void *buffer, uint64_t len,
+                               int is_write, uint64_t access_len)
+{
+}
+
 static int vhost_verify_ring_mappings(struct vhost_dev *dev,
                                       uint64_t start_addr,
                                       uint64_t size)
@@ -285,7 +323,7 @@  static int vhost_verify_ring_mappings(struct vhost_dev *dev,
             continue;
         }
         l = vq->ring_size;
-        p = cpu_physical_memory_map(vq->ring_phys, &l, 1);
+        p = vhost_memory_map(dev, vq->ring_phys, &l, 1);
         if (!p || l != vq->ring_size) {
             virtio_error(dev->vdev, "Unable to map ring buffer for ring %d\n", i);
             return -ENOMEM;
@@ -294,7 +332,7 @@  static int vhost_verify_ring_mappings(struct vhost_dev *dev,
             virtio_error(dev->vdev, "Ring buffer relocated for ring %d\n", i);
             return -EBUSY;
         }
-        cpu_physical_memory_unmap(p, l, 0, 0);
+        vhost_memory_unmap(p, l, 0, 0);
     }
     return 0;
 }
@@ -480,21 +518,21 @@  static int vhost_virtqueue_init(struct vhost_dev *dev,
 
     s = l = virtio_queue_get_desc_size(vdev, idx);
     a = virtio_queue_get_desc_addr(vdev, idx);
-    vq->desc = cpu_physical_memory_map(a, &l, 0);
+    vq->desc = vhost_memory_map(dev, a, &l, 0);
     if (!vq->desc || l != s) {
         r = -ENOMEM;
         goto fail_alloc_desc;
     }
     s = l = virtio_queue_get_avail_size(vdev, idx);
     a = virtio_queue_get_avail_addr(vdev, idx);
-    vq->avail = cpu_physical_memory_map(a, &l, 0);
+    vq->avail = vhost_memory_map(dev, a, &l, 0);
     if (!vq->avail || l != s) {
         r = -ENOMEM;
         goto fail_alloc_avail;
     }
     vq->used_size = s = l = virtio_queue_get_used_size(vdev, idx);
     vq->used_phys = a = virtio_queue_get_used_addr(vdev, idx);
-    vq->used = cpu_physical_memory_map(a, &l, 1);
+    vq->used = vhost_memory_map(dev, a, &l, 1);
     if (!vq->used || l != s) {
         r = -ENOMEM;
         goto fail_alloc_used;
@@ -502,7 +540,7 @@  static int vhost_virtqueue_init(struct vhost_dev *dev,
 
     vq->ring_size = s = l = virtio_queue_get_ring_size(vdev, idx);
     vq->ring_phys = a = virtio_queue_get_ring_addr(vdev, idx);
-    vq->ring = cpu_physical_memory_map(a, &l, 1);
+    vq->ring = vhost_memory_map(dev, a, &l, 1);
     if (!vq->ring || l != s) {
         r = -ENOMEM;
         goto fail_alloc_ring;
@@ -540,16 +578,16 @@  fail_kick:
     vdev->binding->set_host_notifier(vdev->binding_opaque, idx, false);
 fail_host_notifier:
 fail_alloc:
-    cpu_physical_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
+    vhost_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
                               0, 0);
 fail_alloc_ring:
-    cpu_physical_memory_unmap(vq->used, virtio_queue_get_used_size(vdev, idx),
+    vhost_memory_unmap(vq->used, virtio_queue_get_used_size(vdev, idx),
                               0, 0);
 fail_alloc_used:
-    cpu_physical_memory_unmap(vq->avail, virtio_queue_get_avail_size(vdev, idx),
+    vhost_memory_unmap(vq->avail, virtio_queue_get_avail_size(vdev, idx),
                               0, 0);
 fail_alloc_avail:
-    cpu_physical_memory_unmap(vq->desc, virtio_queue_get_desc_size(vdev, idx),
+    vhost_memory_unmap(vq->desc, virtio_queue_get_desc_size(vdev, idx),
                               0, 0);
 fail_alloc_desc:
     return r;
@@ -577,13 +615,13 @@  static void vhost_virtqueue_cleanup(struct vhost_dev *dev,
     }
     virtio_queue_set_last_avail_idx(vdev, idx, state.num);
     assert (r >= 0);
-    cpu_physical_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
+    vhost_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
                               0, virtio_queue_get_ring_size(vdev, idx));
-    cpu_physical_memory_unmap(vq->used, virtio_queue_get_used_size(vdev, idx),
+    vhost_memory_unmap(vq->used, virtio_queue_get_used_size(vdev, idx),
                               1, virtio_queue_get_used_size(vdev, idx));
-    cpu_physical_memory_unmap(vq->avail, virtio_queue_get_avail_size(vdev, idx),
+    vhost_memory_unmap(vq->avail, virtio_queue_get_avail_size(vdev, idx),
                               0, virtio_queue_get_avail_size(vdev, idx));
-    cpu_physical_memory_unmap(vq->desc, virtio_queue_get_desc_size(vdev, idx),
+    vhost_memory_unmap(vq->desc, virtio_queue_get_desc_size(vdev, idx),
                               0, virtio_queue_get_desc_size(vdev, idx));
 }
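The translation the new vhost_memory_map() performs can be exercised outside QEMU with a minimal sketch. The struct and function names below are simplified stand-ins for struct vhost_memory_region, range_covers_byte(), and range_get_last(), not the real definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct vhost_memory_region. */
struct mem_region {
    uint64_t guest_phys_addr;
    uint64_t memory_size;
    uint64_t userspace_addr;
};

/* Last byte of [start, start + size - 1], mirroring range_get_last(). */
static uint64_t range_last(uint64_t start, uint64_t size)
{
    return start + size - 1;
}

/* Translate a guest-physical range to a host-virtual address using the
 * region table: clamp *size to the containing region and return 0 when
 * the address is not covered -- the same shape as vhost_memory_map(). */
static uint64_t translate(const struct mem_region *regs, size_t nregs,
                          uint64_t addr, uint64_t *size)
{
    size_t i;
    if (addr + *size < addr) {          /* guard against wrap-around */
        *size = -addr;
    }
    for (i = 0; i < nregs; ++i) {
        const struct mem_region *r = &regs[i];
        uint64_t rlast, mlast;
        if (addr < r->guest_phys_addr ||
            addr > range_last(r->guest_phys_addr, r->memory_size)) {
            continue;                   /* addr not inside this region */
        }
        rlast = range_last(r->guest_phys_addr, r->memory_size);
        mlast = range_last(addr, *size);
        if (rlast < mlast) {            /* range spills past the region */
            *size -= mlast - rlast;
        }
        return r->userspace_addr + (addr - r->guest_phys_addr);
    }
    return 0;                           /* not a RAM address */
}
```

Unlike cpu_physical_memory_map(), a lookup miss simply returns failure to the caller, which is what lets vhost reject guest-supplied ring addresses instead of exiting.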