virtio: destroy region cache during reset

Message ID 1488876478-6889-1-git-send-email-jasowang@redhat.com
State New

Commit Message

Jason Wang March 7, 2017, 8:47 a.m. UTC
We don't destroy the region cache during reset, which can leak the
maps of the previous driver to a buggy or malicious driver that
doesn't set the vring address before starting to use the device. Fix
this by destroying the region cache during reset and validating it
before trying to use it. While at it, also validate the return value
of address_space_cache_init() during virtio_init_region_cache() to
make sure we have a correct region cache.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/virtio.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 76 insertions(+), 12 deletions(-)

Comments

Cornelia Huck March 7, 2017, 10:16 a.m. UTC | #1
On Tue,  7 Mar 2017 16:47:58 +0800
Jason Wang <jasowang@redhat.com> wrote:

> We don't destroy the region cache during reset, which can leak the
> maps of the previous driver to a buggy or malicious driver that
> doesn't set the vring address before starting to use the device. Fix
> this by destroying the region cache during reset and validating it
> before trying to use it. While at it, also validate the return value
> of address_space_cache_init() during virtio_init_region_cache() to
> make sure we have a correct region cache.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  hw/virtio/virtio.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 76 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 09f4cf4..90324f6 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -131,6 +131,7 @@ static void virtio_init_region_cache(VirtIODevice *vdev, int n)
>      VRingMemoryRegionCaches *new;
>      hwaddr addr, size;
>      int event_size;
> +    int64_t len;
> 
>      event_size = virtio_vdev_has_feature(vq->vdev, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;
> 
> @@ -140,21 +141,41 @@ static void virtio_init_region_cache(VirtIODevice *vdev, int n)
>      }
>      new = g_new0(VRingMemoryRegionCaches, 1);
>      size = virtio_queue_get_desc_size(vdev, n);
> -    address_space_cache_init(&new->desc, vdev->dma_as,
> -                             addr, size, false);
> +    len = address_space_cache_init(&new->desc, vdev->dma_as,
> +                                   addr, size, false);
> +    if (len < size) {
> +        virtio_error(vdev, "Cannot map desc");
> +        goto err_desc;
> +    }
> 
>      size = virtio_queue_get_used_size(vdev, n) + event_size;
> -    address_space_cache_init(&new->used, vdev->dma_as,
> -                             vq->vring.used, size, true);
> +    len = address_space_cache_init(&new->used, vdev->dma_as,
> +                                   vq->vring.used, size, true);
> +    if (len < size) {
> +        virtio_error(vdev, "Cannot map used");
> +        goto err_used;
> +    }
> 
>      size = virtio_queue_get_avail_size(vdev, n) + event_size;
> -    address_space_cache_init(&new->avail, vdev->dma_as,
> -                             vq->vring.avail, size, false);
> +    len = address_space_cache_init(&new->avail, vdev->dma_as,
> +                                   vq->vring.avail, size, false);
> +    if (len < size) {
> +        virtio_error(vdev, "Cannot map avail");
> +        goto err_avail;
> +    }
> 
>      atomic_rcu_set(&vq->vring.caches, new);
>      if (old) {
>          call_rcu(old, virtio_free_region_cache, rcu);
>      }
> +    return;
> +
> +err_avail:
> +    address_space_cache_destroy(&new->used);
> +err_used:
> +    address_space_cache_destroy(&new->desc);
> +err_desc:
> +    g_free(new);
>  }

I think it would be more readable if you moved the addition of this
check (which is a good idea) into a separate patch.

> 
>  /* virt queue functions */
> @@ -190,6 +211,10 @@ static inline uint16_t vring_avail_flags(VirtQueue *vq)
>  {
>      VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>      hwaddr pa = offsetof(VRingAvail, flags);
> +    if (!caches) {
> +        virtio_error(vq->vdev, "Cannot map avail flags");

I'm not sure that virtio_error is the right thing here; ending up in
this function with !caches indicates an error in our logic. An assert
might be better (and I hope we can sort out all of those errors exposed
by the introduction of region caches for 2.9...)

> +        return 0;
> +    }
>      return virtio_lduw_phys_cached(vq->vdev, &caches->avail, pa);
>  }
> 
> @@ -198,6 +223,10 @@ static inline uint16_t vring_avail_idx(VirtQueue *vq)
>  {
>      VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>      hwaddr pa = offsetof(VRingAvail, idx);
> +    if (!caches) {
> +        virtio_error(vq->vdev, "Cannot map avail idx");
> +        return vq->shadow_avail_idx;
> +    }
>      vq->shadow_avail_idx = virtio_lduw_phys_cached(vq->vdev, &caches->avail, pa);
>      return vq->shadow_avail_idx;
>  }
> @@ -207,6 +236,10 @@ static inline uint16_t vring_avail_ring(VirtQueue *vq, int i)
>  {
>      VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>      hwaddr pa = offsetof(VRingAvail, ring[i]);
> +    if (!caches) {
> +        virtio_error(vq->vdev, "Cannot map avail ring");
> +        return 0;
> +    }
>      return virtio_lduw_phys_cached(vq->vdev, &caches->avail, pa);
>  }
> 
> @@ -222,6 +255,10 @@ static inline void vring_used_write(VirtQueue *vq, VRingUsedElem *uelem,
>  {
>      VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>      hwaddr pa = offsetof(VRingUsed, ring[i]);
> +    if (!caches) {
> +        virtio_error(vq->vdev, "Cannot map used ring");
> +        return;
> +    }
>      virtio_tswap32s(vq->vdev, &uelem->id);
>      virtio_tswap32s(vq->vdev, &uelem->len);
>      address_space_write_cached(&caches->used, pa, uelem, sizeof(VRingUsedElem));
> @@ -233,6 +270,10 @@ static uint16_t vring_used_idx(VirtQueue *vq)
>  {
>      VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>      hwaddr pa = offsetof(VRingUsed, idx);
> +    if (!caches) {
> +        virtio_error(vq->vdev, "Cannot map used ring");
> +        return 0;
> +    }
>      return virtio_lduw_phys_cached(vq->vdev, &caches->used, pa);
>  }
> 
> @@ -241,6 +282,10 @@ static inline void vring_used_idx_set(VirtQueue *vq, uint16_t val)
>  {
>      VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>      hwaddr pa = offsetof(VRingUsed, idx);
> +    if (!caches) {
> +        virtio_error(vq->vdev, "Cannot map used idx");
> +        return;
> +    }
>      virtio_stw_phys_cached(vq->vdev, &caches->used, pa, val);
>      address_space_cache_invalidate(&caches->used, pa, sizeof(val));
>      vq->used_idx = val;
> @@ -254,6 +299,10 @@ static inline void vring_used_flags_set_bit(VirtQueue *vq, int mask)
>      hwaddr pa = offsetof(VRingUsed, flags);
>      uint16_t flags = virtio_lduw_phys_cached(vq->vdev, &caches->used, pa);
> 
> +    if (!caches) {
> +        virtio_error(vq->vdev, "Cannot map used flags");

Regardless of whether using virtio_error here is fine: caches was
already dereferenced above...
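
A minimal sketch of the reordering this implies, assuming the !caches
check is kept rather than turned into an assert (the load through
caches->used has to move below the check):

    static inline void vring_used_flags_set_bit(VirtQueue *vq, int mask)
    {
        VirtIODevice *vdev = vq->vdev;
        VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
        hwaddr pa = offsetof(VRingUsed, flags);
        uint16_t flags;

        if (!caches) {
            virtio_error(vdev, "Cannot map used flags");
            return;
        }
        /* only dereference caches after the NULL check */
        flags = virtio_lduw_phys_cached(vdev, &caches->used, pa);
        virtio_stw_phys_cached(vdev, &caches->used, pa, flags | mask);
        address_space_cache_invalidate(&caches->used, pa, sizeof(flags));
    }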

> +        return;
> +    }
>      virtio_stw_phys_cached(vdev, &caches->used, pa, flags | mask);
>      address_space_cache_invalidate(&caches->used, pa, sizeof(flags));
>  }
> @@ -266,6 +315,10 @@ static inline void vring_used_flags_unset_bit(VirtQueue *vq, int mask)
>      hwaddr pa = offsetof(VRingUsed, flags);
>      uint16_t flags = virtio_lduw_phys_cached(vq->vdev, &caches->used, pa);
> 
> +    if (!caches) {
> +        virtio_error(vq->vdev, "Cannot map used flags");

ditto

> +        return;
> +    }
>      virtio_stw_phys_cached(vdev, &caches->used, pa, flags & ~mask);
>      address_space_cache_invalidate(&caches->used, pa, sizeof(flags));
>  }
> @@ -280,6 +333,10 @@ static inline void vring_set_avail_event(VirtQueue *vq, uint16_t val)
>      }
> 
>      caches = atomic_rcu_read(&vq->vring.caches);
> +    if (!caches) {
> +        virtio_error(vq->vdev, "Cannot map avail event");
> +        return;
> +    }
>      pa = offsetof(VRingUsed, ring[vq->vring.num]);
>      virtio_stw_phys_cached(vq->vdev, &caches->used, pa, val);
>      address_space_cache_invalidate(&caches->used, pa, sizeof(val));
> @@ -552,7 +609,7 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
> 
>      max = vq->vring.num;
>      caches = atomic_rcu_read(&vq->vring.caches);
> -    if (caches->desc.len < max * sizeof(VRingDesc)) {
> +    if (!caches || caches->desc.len < max * sizeof(VRingDesc)) {
>          virtio_error(vdev, "Cannot map descriptor ring");
>          goto err;
>      }
> @@ -819,7 +876,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz)
>      i = head;
> 
>      caches = atomic_rcu_read(&vq->vring.caches);
> -    if (caches->desc.len < max * sizeof(VRingDesc)) {
> +    if (!caches || caches->desc.len < max * sizeof(VRingDesc)) {
>          virtio_error(vdev, "Cannot map descriptor ring");
>          goto done;
>      }
> @@ -1117,6 +1174,15 @@ static enum virtio_device_endian virtio_current_cpu_endian(void)
>      }
>  }
> 
> +static void virtio_virtqueue_reset_region_cache(struct VirtQueue *vq)
> +{
> +    VRingMemoryRegionCaches *caches;
> +
> +    caches = atomic_read(&vq->vring.caches);
> +    atomic_set(&vq->vring.caches, NULL);
> +    virtio_free_region_cache(caches);

Shouldn't this use rcu to free it? Unconditionally setting caches to
NULL feels wrong...

> +}
> +
>  void virtio_reset(void *opaque)
>  {
>      VirtIODevice *vdev = opaque;
> @@ -1157,6 +1223,7 @@ void virtio_reset(void *opaque)
>          vdev->vq[i].notification = true;
>          vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
>          vdev->vq[i].inuse = 0;
> +        virtio_virtqueue_reset_region_cache(&vdev->vq[i]);

...especially as you call it in a reset context here.

>      }
>  }
> 
> @@ -2451,13 +2518,10 @@ static void virtio_device_free_virtqueues(VirtIODevice *vdev)
>      }
> 
>      for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> -        VRingMemoryRegionCaches *caches;
>          if (vdev->vq[i].vring.num == 0) {
>              break;
>          }
> -        caches = atomic_read(&vdev->vq[i].vring.caches);
> -        atomic_set(&vdev->vq[i].vring.caches, NULL);
> -        virtio_free_region_cache(caches);
> +        virtio_virtqueue_reset_region_cache(&vdev->vq[i]);

OTOH, immediate destruction may still be called for during device
finalization.

>      }
>      g_free(vdev->vq);
>  }
Paolo Bonzini March 7, 2017, 10:55 a.m. UTC | #2
On 07/03/2017 09:47, Jason Wang wrote:
> We don't destroy the region cache during reset, which can leak the
> maps of the previous driver to a buggy or malicious driver that
> doesn't set the vring address before starting to use the device.

I'm still not sure how this can happen.  Reset does clear
desc/used/avail, which should then be checked before accessing the caches.

Paolo

> Fix this by
> destroying the region cache during reset and validating it before
> trying to use it. While at it, also validate the return value of
> address_space_cache_init() during virtio_init_region_cache() to
> make sure we have a correct region cache.
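
For context, the reset path Paolo refers to zeroes the vring addresses;
a condensed sketch of the relevant loop in virtio_reset(), with the
other per-queue fields omitted:

    for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
        vdev->vq[i].vring.desc = 0;    /* cleared on reset... */
        vdev->vq[i].vring.avail = 0;
        vdev->vq[i].vring.used = 0;    /* ...so callers can test desc != 0 */
    }

An accessor that checks vq->vring.desc before touching the caches would
therefore catch the unconfigured-queue case.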
Jason Wang March 8, 2017, 3:18 a.m. UTC | #3
On 2017-03-07 18:16, Cornelia Huck wrote:
> On Tue,  7 Mar 2017 16:47:58 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>> We don't destroy the region cache during reset, which can leak the
>> maps of the previous driver to a buggy or malicious driver that
>> doesn't set the vring address before starting to use the device. Fix
>> this by destroying the region cache during reset and validating it
>> before trying to use it. While at it, also validate the return value
>> of address_space_cache_init() during virtio_init_region_cache() to
>> make sure we have a correct region cache.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   hw/virtio/virtio.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++--------
>>   1 file changed, 76 insertions(+), 12 deletions(-)
>>
>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> index 09f4cf4..90324f6 100644
>> --- a/hw/virtio/virtio.c
>> +++ b/hw/virtio/virtio.c
>> @@ -131,6 +131,7 @@ static void virtio_init_region_cache(VirtIODevice *vdev, int n)
>>       VRingMemoryRegionCaches *new;
>>       hwaddr addr, size;
>>       int event_size;
>> +    int64_t len;
>>
>>       event_size = virtio_vdev_has_feature(vq->vdev, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;
>>
>> @@ -140,21 +141,41 @@ static void virtio_init_region_cache(VirtIODevice *vdev, int n)
>>       }
>>       new = g_new0(VRingMemoryRegionCaches, 1);
>>       size = virtio_queue_get_desc_size(vdev, n);
>> -    address_space_cache_init(&new->desc, vdev->dma_as,
>> -                             addr, size, false);
>> +    len = address_space_cache_init(&new->desc, vdev->dma_as,
>> +                                   addr, size, false);
>> +    if (len < size) {
>> +        virtio_error(vdev, "Cannot map desc");
>> +        goto err_desc;
>> +    }
>>
>>       size = virtio_queue_get_used_size(vdev, n) + event_size;
>> -    address_space_cache_init(&new->used, vdev->dma_as,
>> -                             vq->vring.used, size, true);
>> +    len = address_space_cache_init(&new->used, vdev->dma_as,
>> +                                   vq->vring.used, size, true);
>> +    if (len < size) {
>> +        virtio_error(vdev, "Cannot map used");
>> +        goto err_used;
>> +    }
>>
>>       size = virtio_queue_get_avail_size(vdev, n) + event_size;
>> -    address_space_cache_init(&new->avail, vdev->dma_as,
>> -                             vq->vring.avail, size, false);
>> +    len = address_space_cache_init(&new->avail, vdev->dma_as,
>> +                                   vq->vring.avail, size, false);
>> +    if (len < size) {
>> +        virtio_error(vdev, "Cannot map avail");
>> +        goto err_avail;
>> +    }
>>
>>       atomic_rcu_set(&vq->vring.caches, new);
>>       if (old) {
>>           call_rcu(old, virtio_free_region_cache, rcu);
>>       }
>> +    return;
>> +
>> +err_avail:
>> +    address_space_cache_destroy(&new->used);
>> +err_used:
>> +    address_space_cache_destroy(&new->desc);
>> +err_desc:
>> +    g_free(new);
>>   }
> I think it would be more readable if you moved the addition of this
> check (which is a good idea) into a separate patch.

Ok.

>>   /* virt queue functions */
>> @@ -190,6 +211,10 @@ static inline uint16_t vring_avail_flags(VirtQueue *vq)
>>   {
>>       VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>>       hwaddr pa = offsetof(VRingAvail, flags);
>> +    if (!caches) {
>> +        virtio_error(vq->vdev, "Cannot map avail flags");
> I'm not sure that virtio_error is the right thing here; ending up in
> this function with !caches indicates an error in our logic.

Probably not; this can be triggered by a buggy guest.

> An assert
> might be better (and I hope we can sort out all of those errors exposed
> by the introduction of region caches for 2.9...)

I thought we should avoid asserts as much as possible in this case. But
if you and the maintainer want an assert, that's also fine.

>
>> +        return 0;
>> +    }
>>       return virtio_lduw_phys_cached(vq->vdev, &caches->avail, pa);
>>   }
>>
>> @@ -198,6 +223,10 @@ static inline uint16_t vring_avail_idx(VirtQueue *vq)
>>   {
>>       VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>>       hwaddr pa = offsetof(VRingAvail, idx);
>> +    if (!caches) {
>> +        virtio_error(vq->vdev, "Cannot map avail idx");
>> +        return vq->shadow_avail_idx;
>> +    }
>>       vq->shadow_avail_idx = virtio_lduw_phys_cached(vq->vdev, &caches->avail, pa);
>>       return vq->shadow_avail_idx;
>>   }
>> @@ -207,6 +236,10 @@ static inline uint16_t vring_avail_ring(VirtQueue *vq, int i)
>>   {
>>       VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>>       hwaddr pa = offsetof(VRingAvail, ring[i]);
>> +    if (!caches) {
>> +        virtio_error(vq->vdev, "Cannot map avail ring");
>> +        return 0;
>> +    }
>>       return virtio_lduw_phys_cached(vq->vdev, &caches->avail, pa);
>>   }
>>
>> @@ -222,6 +255,10 @@ static inline void vring_used_write(VirtQueue *vq, VRingUsedElem *uelem,
>>   {
>>       VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>>       hwaddr pa = offsetof(VRingUsed, ring[i]);
>> +    if (!caches) {
>> +        virtio_error(vq->vdev, "Cannot map used ring");
>> +        return;
>> +    }
>>       virtio_tswap32s(vq->vdev, &uelem->id);
>>       virtio_tswap32s(vq->vdev, &uelem->len);
>>       address_space_write_cached(&caches->used, pa, uelem, sizeof(VRingUsedElem));
>> @@ -233,6 +270,10 @@ static uint16_t vring_used_idx(VirtQueue *vq)
>>   {
>>       VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>>       hwaddr pa = offsetof(VRingUsed, idx);
>> +    if (!caches) {
>> +        virtio_error(vq->vdev, "Cannot map used ring");
>> +        return 0;
>> +    }
>>       return virtio_lduw_phys_cached(vq->vdev, &caches->used, pa);
>>   }
>>
>> @@ -241,6 +282,10 @@ static inline void vring_used_idx_set(VirtQueue *vq, uint16_t val)
>>   {
>>       VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>>       hwaddr pa = offsetof(VRingUsed, idx);
>> +    if (!caches) {
>> +        virtio_error(vq->vdev, "Cannot map used idx");
>> +        return;
>> +    }
>>       virtio_stw_phys_cached(vq->vdev, &caches->used, pa, val);
>>       address_space_cache_invalidate(&caches->used, pa, sizeof(val));
>>       vq->used_idx = val;
>> @@ -254,6 +299,10 @@ static inline void vring_used_flags_set_bit(VirtQueue *vq, int mask)
>>       hwaddr pa = offsetof(VRingUsed, flags);
>>       uint16_t flags = virtio_lduw_phys_cached(vq->vdev, &caches->used, pa);
>>
>> +    if (!caches) {
>> +        virtio_error(vq->vdev, "Cannot map used flags");
> Regardless of whether using virtio_error here is fine: caches was
> already dereferenced above...
>
>> +        return;
>> +    }
>>       virtio_stw_phys_cached(vdev, &caches->used, pa, flags | mask);
>>       address_space_cache_invalidate(&caches->used, pa, sizeof(flags));
>>   }
>> @@ -266,6 +315,10 @@ static inline void vring_used_flags_unset_bit(VirtQueue *vq, int mask)
>>       hwaddr pa = offsetof(VRingUsed, flags);
>>       uint16_t flags = virtio_lduw_phys_cached(vq->vdev, &caches->used, pa);
>>
>> +    if (!caches) {
>> +        virtio_error(vq->vdev, "Cannot map used flags");
> ditto
>
>> +        return;
>> +    }
>>       virtio_stw_phys_cached(vdev, &caches->used, pa, flags & ~mask);
>>       address_space_cache_invalidate(&caches->used, pa, sizeof(flags));
>>   }
>> @@ -280,6 +333,10 @@ static inline void vring_set_avail_event(VirtQueue *vq, uint16_t val)
>>       }
>>
>>       caches = atomic_rcu_read(&vq->vring.caches);
>> +    if (!caches) {
>> +        virtio_error(vq->vdev, "Cannot map avail event");
>> +        return;
>> +    }
>>       pa = offsetof(VRingUsed, ring[vq->vring.num]);
>>       virtio_stw_phys_cached(vq->vdev, &caches->used, pa, val);
>>       address_space_cache_invalidate(&caches->used, pa, sizeof(val));
>> @@ -552,7 +609,7 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
>>
>>       max = vq->vring.num;
>>       caches = atomic_rcu_read(&vq->vring.caches);
>> -    if (caches->desc.len < max * sizeof(VRingDesc)) {
>> +    if (!caches || caches->desc.len < max * sizeof(VRingDesc)) {
>>           virtio_error(vdev, "Cannot map descriptor ring");
>>           goto err;
>>       }
>> @@ -819,7 +876,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz)
>>       i = head;
>>
>>       caches = atomic_rcu_read(&vq->vring.caches);
>> -    if (caches->desc.len < max * sizeof(VRingDesc)) {
>> +    if (!caches || caches->desc.len < max * sizeof(VRingDesc)) {
>>           virtio_error(vdev, "Cannot map descriptor ring");
>>           goto done;
>>       }
>> @@ -1117,6 +1174,15 @@ static enum virtio_device_endian virtio_current_cpu_endian(void)
>>       }
>>   }
>>
>> +static void virtio_virtqueue_reset_region_cache(struct VirtQueue *vq)
>> +{
>> +    VRingMemoryRegionCaches *caches;
>> +
>> +    caches = atomic_read(&vq->vring.caches);
>> +    atomic_set(&vq->vring.caches, NULL);
>> +    virtio_free_region_cache(caches);
> Shouldn't this use rcu to free it? Unconditionally setting caches to
> NULL feels wrong...

Right, will switch to use rcu.
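
For reference, a sketch of what the rcu-based variant could look like,
assuming virtio_free_region_cache() stays usable as the rcu callback
(the exact form depends on the respin):

    static void virtio_virtqueue_reset_region_cache(struct VirtQueue *vq)
    {
        VRingMemoryRegionCaches *caches;

        caches = atomic_read(&vq->vring.caches);
        atomic_rcu_set(&vq->vring.caches, NULL);
        if (caches) {
            /* defer the free until concurrent RCU readers are done */
            call_rcu(caches, virtio_free_region_cache, rcu);
        }
    }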

>> +}
>> +
>>   void virtio_reset(void *opaque)
>>   {
>>       VirtIODevice *vdev = opaque;
>> @@ -1157,6 +1223,7 @@ void virtio_reset(void *opaque)
>>           vdev->vq[i].notification = true;
>>           vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
>>           vdev->vq[i].inuse = 0;
>> +        virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
> ...especially as you call it in a reset context here.
>
>>       }
>>   }
>>
>> @@ -2451,13 +2518,10 @@ static void virtio_device_free_virtqueues(VirtIODevice *vdev)
>>       }
>>
>>       for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
>> -        VRingMemoryRegionCaches *caches;
>>           if (vdev->vq[i].vring.num == 0) {
>>               break;
>>           }
>> -        caches = atomic_read(&vdev->vq[i].vring.caches);
>> -        atomic_set(&vdev->vq[i].vring.caches, NULL);
>> -        virtio_free_region_cache(caches);
>> +        virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
> OTOH, immediate destruction may still be called for during device
> finalization.
>

Right, but to avoid code duplication, using rcu unconditionally should
do no harm here.

Thanks

>>       }
>>       g_free(vdev->vq);
>>   }
Cornelia Huck March 8, 2017, 9:19 a.m. UTC | #4
On Wed, 8 Mar 2017 11:18:27 +0800
Jason Wang <jasowang@redhat.com> wrote:

> On 2017-03-07 18:16, Cornelia Huck wrote:
> > On Tue,  7 Mar 2017 16:47:58 +0800
> > Jason Wang <jasowang@redhat.com> wrote:
> >
> >> We don't destroy the region cache during reset, which can leak the
> >> maps of the previous driver to a buggy or malicious driver that
> >> doesn't set the vring address before starting to use the device. Fix
> >> this by destroying the region cache during reset and validating it
> >> before trying to use it. While at it, also validate the return value
> >> of address_space_cache_init() during virtio_init_region_cache() to
> >> make sure we have a correct region cache.
> >>
> >> Signed-off-by: Jason Wang <jasowang@redhat.com>
> >> ---
> >>   hw/virtio/virtio.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++--------
> >>   1 file changed, 76 insertions(+), 12 deletions(-)

> >> @@ -190,6 +211,10 @@ static inline uint16_t vring_avail_flags(VirtQueue *vq)
> >>   {
> >>       VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
> >>       hwaddr pa = offsetof(VRingAvail, flags);
> >> +    if (!caches) {
> >> +        virtio_error(vq->vdev, "Cannot map avail flags");
> > I'm not sure that virtio_error is the right thing here; ending up in
> > this function with !caches indicates an error in our logic.
> 
> Probably not; this can be triggered by a buggy guest.

I would think that even a buggy guest should not be able to trigger
accesses to vring structures that have not yet been set up. What am I
missing?

> 
> > An assert
> > might be better (and I hope we can sort out all of those errors exposed
> > by the introduction of region caches for 2.9...)
> 
> I thought we should avoid asserts as much as possible in this case. But
> if you and the maintainer want an assert, that's also fine.

My personal rule-of-thumb:
- If it is something that can be triggered by the guest, or it is
something that is easily recovered, set the device to broken.
- If it is something that indicates that we messed up our internal
logic, use an assert.

I think arriving here with !caches indicates the second case; but if
there is a way for a guest to trigger this, setting the device to
broken would certainly be better.
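
As a sketch, the two options for the !caches case would look roughly
like this (illustrative only, not a final form):

    VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);

    /* option 1: guest-triggerable, so mark the device broken and bail out */
    if (!caches) {
        virtio_error(vq->vdev, "Cannot map avail flags");
        return 0;
    }

    /* option 2: internal logic error that no guest action should produce */
    assert(caches);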

> 
> >
> >> +        return 0;
> >> +    }
> >>       return virtio_lduw_phys_cached(vq->vdev, &caches->avail, pa);
> >>   }
> >>

> >> @@ -1117,6 +1174,15 @@ static enum virtio_device_endian virtio_current_cpu_endian(void)
> >>       }
> >>   }
> >>
> >> +static void virtio_virtqueue_reset_region_cache(struct VirtQueue *vq)
> >> +{
> >> +    VRingMemoryRegionCaches *caches;
> >> +
> >> +    caches = atomic_read(&vq->vring.caches);
> >> +    atomic_set(&vq->vring.caches, NULL);
> >> +    virtio_free_region_cache(caches);
> > Shouldn't this use rcu to free it? Unconditionally setting caches to
> > NULL feels wrong...
> 
> Right, will switch to use rcu.
> 
> >> +}
> >> +
> >>   void virtio_reset(void *opaque)
> >>   {
> >>       VirtIODevice *vdev = opaque;
> >> @@ -1157,6 +1223,7 @@ void virtio_reset(void *opaque)
> >>           vdev->vq[i].notification = true;
> >>           vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
> >>           vdev->vq[i].inuse = 0;
> >> +        virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
> > ...especially as you call it in a reset context here.
> >
> >>       }
> >>   }
> >>
> >> @@ -2451,13 +2518,10 @@ static void virtio_device_free_virtqueues(VirtIODevice *vdev)
> >>       }
> >>
> >>       for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> >> -        VRingMemoryRegionCaches *caches;
> >>           if (vdev->vq[i].vring.num == 0) {
> >>               break;
> >>           }
> >> -        caches = atomic_read(&vdev->vq[i].vring.caches);
> >> -        atomic_set(&vdev->vq[i].vring.caches, NULL);
> >> -        virtio_free_region_cache(caches);
> >> +        virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
> > OTOH, immediate destruction may still be called for during device
> > finalization.
> >
> 
> Right, but to avoid code duplication, using rcu unconditionally should
> do no harm here.

Yes, this should be fine.
Jason Wang March 8, 2017, 9:51 a.m. UTC | #5
On 2017-03-08 17:19, Cornelia Huck wrote:
> On Wed, 8 Mar 2017 11:18:27 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>> On 2017-03-07 18:16, Cornelia Huck wrote:
>>> On Tue,  7 Mar 2017 16:47:58 +0800
>>> Jason Wang <jasowang@redhat.com> wrote:
>>>
>>>> We don't destroy the region cache during reset, which can leak the
>>>> maps of the previous driver to a buggy or malicious driver that
>>>> doesn't set the vring address before starting to use the device. Fix
>>>> this by destroying the region cache during reset and validating it
>>>> before trying to use it. While at it, also validate the return value
>>>> of address_space_cache_init() during virtio_init_region_cache() to
>>>> make sure we have a correct region cache.
>>>>
>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>> ---
>>>>    hw/virtio/virtio.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++--------
>>>>    1 file changed, 76 insertions(+), 12 deletions(-)
>>>> @@ -190,6 +211,10 @@ static inline uint16_t vring_avail_flags(VirtQueue *vq)
>>>>    {
>>>>        VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>>>>        hwaddr pa = offsetof(VRingAvail, flags);
>>>> +    if (!caches) {
>>>> +        virtio_error(vq->vdev, "Cannot map avail flags");
>>> I'm not sure that virtio_error is the right thing here; ending up in
>>> this function with !caches indicates an error in our logic.
>> Probably not; this can be triggered by a buggy guest.
> I would think that even a buggy guest should not be able to trigger
> accesses to vring structures that have not yet been set up. What am I
> missing?

I think this may happen if a driver starts the device without setting the
vring address. In this case, the region caches still hold the mapping of
the previous driver. But maybe I'm wrong.

>
>>> An assert
>>> might be better (and I hope we can sort out all of those errors exposed
>>> by the introduction of region caches for 2.9...)
>> I thought we should avoid asserts as much as possible in this case. But
>> if you and the maintainer want an assert, that's also fine.
> My personal rule-of-thumb:
> - If it is something that can be triggered by the guest, or it is
> something that is easily recovered, set the device to broken.
> - If it is something that indicates that we messed up our internal
> logic, use an assert.
>
> I think arriving here with !caches indicates the second case; but if
> there is a way for a guest to trigger this, setting the device to
> broken would certainly be better.

Yes, broken seems better.

Thanks
Cornelia Huck March 8, 2017, 10:12 a.m. UTC | #6
On Wed, 8 Mar 2017 17:51:22 +0800
Jason Wang <jasowang@redhat.com> wrote:

> On 2017-03-08 17:19, Cornelia Huck wrote:
> > On Wed, 8 Mar 2017 11:18:27 +0800
> > Jason Wang <jasowang@redhat.com> wrote:
> >
> >> On 2017-03-07 18:16, Cornelia Huck wrote:
> >>> On Tue,  7 Mar 2017 16:47:58 +0800
> >>> Jason Wang <jasowang@redhat.com> wrote:
> >>>
> >>>> We don't destroy the region cache during reset, which can leak the
> >>>> maps of the previous driver to a buggy or malicious driver that
> >>>> doesn't set the vring address before starting to use the device. Fix
> >>>> this by destroying the region cache during reset and validating it
> >>>> before trying to use it. While at it, also validate the return value
> >>>> of address_space_cache_init() during virtio_init_region_cache() to
> >>>> make sure we have a correct region cache.
> >>>>
> >>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> >>>> ---
> >>>>    hw/virtio/virtio.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++--------
> >>>>    1 file changed, 76 insertions(+), 12 deletions(-)
> >>>> @@ -190,6 +211,10 @@ static inline uint16_t vring_avail_flags(VirtQueue *vq)
> >>>>    {
> >>>>        VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
> >>>>        hwaddr pa = offsetof(VRingAvail, flags);
> >>>> +    if (!caches) {
> >>>> +        virtio_error(vq->vdev, "Cannot map avail flags");
> >>> I'm not sure that virtio_error is the right thing here; ending up in
> >>> this function with !caches indicates an error in our logic.
> >> Probably not; this can be triggered by a buggy guest.
> > I would think that even a buggy guest should not be able to trigger
> > accesses to vring structures that have not yet been set up. What am I
> > missing?
> 
> I think this may happen if a driver starts the device without setting the
> vring address. In this case, the region caches still hold the mapping of
> the previous driver. But maybe I'm wrong.

So the sequence would be:

- Driver #1 sets up the device, then abandons it without cleaning up
via reset
- Driver #2 uses the device without doing a reset or proper setup

?

I can't quite see why caches would be NULL in that case...

Having reset clean up the caches as well (IOW, the other part of your
patch) should make sure that we always have a consistent view of
descriptors and their caches, I think. The checks for desc != NULL and
friends should catch other driver bugginess.
Jason Wang March 9, 2017, 2:19 a.m. UTC | #7
On 2017-03-08 18:12, Cornelia Huck wrote:
> On Wed, 8 Mar 2017 17:51:22 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>> On 2017-03-08 17:19, Cornelia Huck wrote:
>>> On Wed, 8 Mar 2017 11:18:27 +0800
>>> Jason Wang <jasowang@redhat.com> wrote:
>>>
>>>> On 2017-03-07 18:16, Cornelia Huck wrote:
>>>>> On Tue,  7 Mar 2017 16:47:58 +0800
>>>>> Jason Wang <jasowang@redhat.com> wrote:
>>>>>
>>>>>> We don't destroy the region cache during reset, which can leak the
>>>>>> maps of the previous driver to a buggy or malicious driver that
>>>>>> doesn't set the vring address before starting to use the device. Fix
>>>>>> this by destroying the region cache during reset and validating it
>>>>>> before trying to use it. While at it, also validate the return value
>>>>>> of address_space_cache_init() during virtio_init_region_cache() to
>>>>>> make sure we have a correct region cache.
>>>>>>
>>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>> ---
>>>>>>     hw/virtio/virtio.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++--------
>>>>>>     1 file changed, 76 insertions(+), 12 deletions(-)
>>>>>> @@ -190,6 +211,10 @@ static inline uint16_t vring_avail_flags(VirtQueue *vq)
>>>>>>     {
>>>>>>         VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>>>>>>         hwaddr pa = offsetof(VRingAvail, flags);
>>>>>> +    if (!caches) {
>>>>>> +        virtio_error(vq->vdev, "Cannot map avail flags");
>>>>> I'm not sure that virtio_error is the right thing here; ending up in
>>>>> this function with !caches indicates an error in our logic.
>>>> Probably not; this can be triggered by a buggy guest.
>>> I would think that even a buggy guest should not be able to trigger
>>> accesses to vring structures that have not yet been set up. What am I
>>> missing?
>> I think this may happen if a driver starts the device without setting the
>> vring address. In this case, the region caches still hold the mapping of
>> the previous driver. But maybe I'm wrong.
> So the sequence would be:
>
> - Driver #1 sets up the device, then abandons it without cleaning up
> via reset

If driver #1 resets the device in this case, the caches would be NULL.

> - Driver #2 uses the device without doing a reset or proper setup

Without this patch, even if driver #2 does a reset, it can still use the
old map if it doesn't set the queue pfn.

>
> ?
>
> I can't quite see why caches would be NULL in that case...
>
> Having reset clean up the caches as well (IOW, the other part of your
> patch) should make sure that we always have a consistent view of
> descriptors and their caches,

And prevent it from being leaked to untrusted drivers.

Thanks

> I think. The checks for desc != NULL and
> friends should catch other driver bugginess.
>
>
Cornelia Huck March 9, 2017, 11:07 a.m. UTC | #8
On Thu, 9 Mar 2017 10:19:47 +0800
Jason Wang <jasowang@redhat.com> wrote:

> On 2017-03-08 18:12, Cornelia Huck wrote:
> > On Wed, 8 Mar 2017 17:51:22 +0800
> > Jason Wang <jasowang@redhat.com> wrote:
> >
> >> On 2017-03-08 17:19, Cornelia Huck wrote:
> >>> On Wed, 8 Mar 2017 11:18:27 +0800
> >>> Jason Wang <jasowang@redhat.com> wrote:
> >>>
> >>>> On 2017-03-07 18:16, Cornelia Huck wrote:
> >>>>> On Tue,  7 Mar 2017 16:47:58 +0800
> >>>>> Jason Wang <jasowang@redhat.com> wrote:
> >>>>>
> >>>>>> We don't destroy the region cache during reset, which can leak the
> >>>>>> maps of the previous driver to a buggy or malicious driver that
> >>>>>> doesn't set the vring address before starting to use the device. Fix
> >>>>>> this by destroying the region cache during reset and validating it
> >>>>>> before trying to use it. While at it, also validate the return value
> >>>>>> of address_space_cache_init() during virtio_init_region_cache() to
> >>>>>> make sure we have a correct region cache.
> >>>>>>
> >>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> >>>>>> ---
> >>>>>>     hw/virtio/virtio.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++--------
> >>>>>>     1 file changed, 76 insertions(+), 12 deletions(-)
> >>>>>> @@ -190,6 +211,10 @@ static inline uint16_t vring_avail_flags(VirtQueue *vq)
> >>>>>>     {
> >>>>>>         VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
> >>>>>>         hwaddr pa = offsetof(VRingAvail, flags);
> >>>>>> +    if (!caches) {
> >>>>>> +        virtio_error(vq->vdev, "Cannot map avail flags");
> >>>>> I'm not sure that virtio_error is the right thing here; ending up in
> >>>>> this function with !caches indicates an error in our logic.
> >>>> Probably not; this can be triggered by a buggy guest.
> >>> I would think that even a buggy guest should not be able to trigger
> >>> accesses to vring structures that have not yet been set up. What am I
> >>> missing?
> >> I think this may happen if a driver starts the device without setting the
> >> vring address. In this case, the region caches still hold the mapping of
> >> the previous driver. But maybe I'm wrong.
> > So the sequence would be:
> >
> > - Driver #1 sets up the device, then abandons it without cleaning up
> > via reset
> 
> If driver #1 resets the device in this case, the caches would be NULL.

Hm, how? Without your patch, the queue addresses are reset to 0 in that
case but the cache is not cleaned up.

> 
> > - Driver #2 uses the device without doing a reset or proper setup
> 
> Without this patch, even if driver #2 does a reset, it can still use the
> old map if it doesn't set the queue pfn.

Yes, the cleanup-on-reset is definitely needed.

> 
> >
> > ?
> >
> > I can't quite see why caches would be NULL in that case...
> >
> > Having reset clean up the caches as well (IOW, the other part of your
> > patch) should make sure that we always have a consistent view of
> > descriptors and their caches,
> 
> And prevent it from being leaked to untrusted drivers.

And that as well, agreed.

I'm still not sure why the checks for !caches help, though...
Paolo Bonzini March 9, 2017, 11:12 a.m. UTC | #9
On 09/03/2017 12:07, Cornelia Huck wrote:
>>> - Driver #2 uses the device without doing a reset or proper setup
>> Without this patch, even if driver #2 does a reset, it can still use the
>> old map if it doesn't set the queue pfn.
> 
> Yes, the cleanup-on-reset is definitely needed.

It is good to have for defensiveness, but it would still cause a
segfault, so we should also add the checks on vq->vring.desc throughout.
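
A sketch of what such a check could look like in one of the accessors
(hypothetical placement and error handling; the actual call sites would
be picked in a follow-up):

    static inline uint16_t vring_avail_flags(VirtQueue *vq)
    {
        VRingMemoryRegionCaches *caches;
        hwaddr pa = offsetof(VRingAvail, flags);

        /* bail out if the driver never set the vring address */
        if (!vq->vring.desc) {
            return 0;
        }
        caches = atomic_rcu_read(&vq->vring.caches);
        if (!caches) {
            return 0;
        }
        return virtio_lduw_phys_cached(vq->vdev, &caches->avail, pa);
    }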

Paolo
Cornelia Huck March 9, 2017, 11:38 a.m. UTC | #10
On Thu, 9 Mar 2017 12:12:00 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 09/03/2017 12:07, Cornelia Huck wrote:
> >>> - Driver #2 uses the device without doing a reset or proper setup
> >> Without this patch, even if driver #2 does a reset, it can still use the
> >> old map if it doesn't set the queue pfn.
> > 
> > Yes, the cleanup-on-reset is definitely needed.
> 
> It is good to have for defensiveness, but it would still cause a
> segfault, so we should also add the checks on vq->vring.desc throughout.

Agreed.
Jason Wang March 10, 2017, 10:57 a.m. UTC | #11
On 2017-03-09 19:07, Cornelia Huck wrote:
> On Thu, 9 Mar 2017 10:19:47 +0800
> Jason Wang<jasowang@redhat.com>  wrote:
>
>> On 2017-03-08 18:12, Cornelia Huck wrote:
>>> On Wed, 8 Mar 2017 17:51:22 +0800
>>> Jason Wang<jasowang@redhat.com>  wrote:
>>>
>>>> On 2017-03-08 17:19, Cornelia Huck wrote:
>>>>> On Wed, 8 Mar 2017 11:18:27 +0800
>>>>> Jason Wang<jasowang@redhat.com>  wrote:
>>>>>
>>>>>> On 2017-03-07 18:16, Cornelia Huck wrote:
>>>>>>> On Tue,  7 Mar 2017 16:47:58 +0800
>>>>>>> Jason Wang<jasowang@redhat.com>  wrote:
>>>>>>>
>>>>>>>> We don't destroy the region cache during reset, which can leak the
>>>>>>>> maps of the previous driver to a buggy or malicious driver that
>>>>>>>> doesn't set the vring address before starting to use the device. Fix
>>>>>>>> this by destroying the region cache during reset and validating it
>>>>>>>> before trying to use it. While at it, also validate the return value
>>>>>>>> of address_space_cache_init() during virtio_init_region_cache() to
>>>>>>>> make sure we have a correct region cache.
>>>>>>>>
>>>>>>>> Signed-off-by: Jason Wang<jasowang@redhat.com>
>>>>>>>> ---
>>>>>>>>      hw/virtio/virtio.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++--------
>>>>>>>>      1 file changed, 76 insertions(+), 12 deletions(-)
>>>>>>>> @@ -190,6 +211,10 @@ static inline uint16_t vring_avail_flags(VirtQueue *vq)
>>>>>>>>      {
>>>>>>>>          VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
>>>>>>>>          hwaddr pa = offsetof(VRingAvail, flags);
>>>>>>>> +    if (!caches) {
>>>>>>>> +        virtio_error(vq->vdev, "Cannot map avail flags");
>>>>>>> I'm not sure that virtio_error is the right thing here; ending up in
>>>>>>> this function with !caches indicates an error in our logic.
>>>>>> Probably not; this can be triggered by a buggy guest.
>>>>> I would think that even a buggy guest should not be able to trigger
>>>>> accesses to vring structures that have not yet been set up. What am I
>>>>> missing?
>>>> I think this may happen if a driver starts the device without setting the
>>>> vring address. In this case, the region caches still hold the mapping of
>>>> the previous driver. But maybe I'm wrong.
>>> So the sequence would be:
>>>
>>> - Driver #1 sets up the device, then abandons it without cleaning up
>>> via reset
>> If driver #1 resets the device in this case, the caches would be NULL.
> Hm, how? Without your patch, the queue addresses are reset to 0 in that
> case but the cache is not cleaned up.
>

Yes, but with this patch, caches will be set to NULL during reset. So
a subsequent vq access without setting the queue pfn may hit NULL.

Thanks
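
Compressed, the sequence described above (hypothetical guest
behaviour):

    /*
     * 1. driver #1 sets the queue PFN and uses the device: caches mapped
     * 2. the device is reset: with this patch, caches become NULL
     * 3. driver #2 kicks the queue without setting the queue PFN again:
     *    the accessors now see !caches instead of reusing driver #1's maps
     */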

Patch

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 09f4cf4..90324f6 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -131,6 +131,7 @@  static void virtio_init_region_cache(VirtIODevice *vdev, int n)
     VRingMemoryRegionCaches *new;
     hwaddr addr, size;
     int event_size;
+    int64_t len;
 
     event_size = virtio_vdev_has_feature(vq->vdev, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;
 
@@ -140,21 +141,41 @@  static void virtio_init_region_cache(VirtIODevice *vdev, int n)
     }
     new = g_new0(VRingMemoryRegionCaches, 1);
     size = virtio_queue_get_desc_size(vdev, n);
-    address_space_cache_init(&new->desc, vdev->dma_as,
-                             addr, size, false);
+    len = address_space_cache_init(&new->desc, vdev->dma_as,
+                                   addr, size, false);
+    if (len < size) {
+        virtio_error(vdev, "Cannot map desc");
+        goto err_desc;
+    }
 
     size = virtio_queue_get_used_size(vdev, n) + event_size;
-    address_space_cache_init(&new->used, vdev->dma_as,
-                             vq->vring.used, size, true);
+    len = address_space_cache_init(&new->used, vdev->dma_as,
+                                   vq->vring.used, size, true);
+    if (len < size) {
+        virtio_error(vdev, "Cannot map used");
+        goto err_used;
+    }
 
     size = virtio_queue_get_avail_size(vdev, n) + event_size;
-    address_space_cache_init(&new->avail, vdev->dma_as,
-                             vq->vring.avail, size, false);
+    len = address_space_cache_init(&new->avail, vdev->dma_as,
+                                   vq->vring.avail, size, false);
+    if (len < size) {
+        virtio_error(vdev, "Cannot map avail");
+        goto err_avail;
+    }
 
     atomic_rcu_set(&vq->vring.caches, new);
     if (old) {
         call_rcu(old, virtio_free_region_cache, rcu);
     }
+    return;
+
+err_avail:
+    address_space_cache_destroy(&new->used);
+err_used:
+    address_space_cache_destroy(&new->desc);
+err_desc:
+    g_free(new);
 }
 
 /* virt queue functions */
@@ -190,6 +211,10 @@  static inline uint16_t vring_avail_flags(VirtQueue *vq)
 {
     VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
     hwaddr pa = offsetof(VRingAvail, flags);
+    if (!caches) {
+        virtio_error(vq->vdev, "Cannot map avail flags");
+        return 0;
+    }
     return virtio_lduw_phys_cached(vq->vdev, &caches->avail, pa);
 }
 
@@ -198,6 +223,10 @@  static inline uint16_t vring_avail_idx(VirtQueue *vq)
 {
     VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
     hwaddr pa = offsetof(VRingAvail, idx);
+    if (!caches) {
+        virtio_error(vq->vdev, "Cannot map avail idx");
+        return vq->shadow_avail_idx;
+    }
     vq->shadow_avail_idx = virtio_lduw_phys_cached(vq->vdev, &caches->avail, pa);
     return vq->shadow_avail_idx;
 }
@@ -207,6 +236,10 @@  static inline uint16_t vring_avail_ring(VirtQueue *vq, int i)
 {
     VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
     hwaddr pa = offsetof(VRingAvail, ring[i]);
+    if (!caches) {
+        virtio_error(vq->vdev, "Cannot map avail ring");
+        return 0;
+    }
     return virtio_lduw_phys_cached(vq->vdev, &caches->avail, pa);
 }
 
@@ -222,6 +255,10 @@  static inline void vring_used_write(VirtQueue *vq, VRingUsedElem *uelem,
 {
     VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
     hwaddr pa = offsetof(VRingUsed, ring[i]);
+    if (!caches) {
+        virtio_error(vq->vdev, "Cannot map used ring");
+        return;
+    }
     virtio_tswap32s(vq->vdev, &uelem->id);
     virtio_tswap32s(vq->vdev, &uelem->len);
     address_space_write_cached(&caches->used, pa, uelem, sizeof(VRingUsedElem));
@@ -233,6 +270,10 @@  static uint16_t vring_used_idx(VirtQueue *vq)
 {
     VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
     hwaddr pa = offsetof(VRingUsed, idx);
+    if (!caches) {
+        virtio_error(vq->vdev, "Cannot map used ring");
+        return 0;
+    }
     return virtio_lduw_phys_cached(vq->vdev, &caches->used, pa);
 }
 
@@ -241,6 +282,10 @@  static inline void vring_used_idx_set(VirtQueue *vq, uint16_t val)
 {
     VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches);
     hwaddr pa = offsetof(VRingUsed, idx);
+    if (!caches) {
+        virtio_error(vq->vdev, "Cannot map used idx");
+        return;
+    }
     virtio_stw_phys_cached(vq->vdev, &caches->used, pa, val);
     address_space_cache_invalidate(&caches->used, pa, sizeof(val));
     vq->used_idx = val;
@@ -254,6 +299,10 @@  static inline void vring_used_flags_set_bit(VirtQueue *vq, int mask)
     hwaddr pa = offsetof(VRingUsed, flags);
     uint16_t flags = virtio_lduw_phys_cached(vq->vdev, &caches->used, pa);
 
+    if (!caches) {
+        virtio_error(vq->vdev, "Cannot map used flags");
+        return;
+    }
     virtio_stw_phys_cached(vdev, &caches->used, pa, flags | mask);
     address_space_cache_invalidate(&caches->used, pa, sizeof(flags));
 }
@@ -266,6 +315,10 @@  static inline void vring_used_flags_unset_bit(VirtQueue *vq, int mask)
     hwaddr pa = offsetof(VRingUsed, flags);
     uint16_t flags = virtio_lduw_phys_cached(vq->vdev, &caches->used, pa);
 
+    if (!caches) {
+        virtio_error(vq->vdev, "Cannot map used flags");
+        return;
+    }
     virtio_stw_phys_cached(vdev, &caches->used, pa, flags & ~mask);
     address_space_cache_invalidate(&caches->used, pa, sizeof(flags));
 }
@@ -280,6 +333,10 @@  static inline void vring_set_avail_event(VirtQueue *vq, uint16_t val)
     }
 
     caches = atomic_rcu_read(&vq->vring.caches);
+    if (!caches) {
+        virtio_error(vq->vdev, "Cannot map avail event");
+        return;
+    }
     pa = offsetof(VRingUsed, ring[vq->vring.num]);
     virtio_stw_phys_cached(vq->vdev, &caches->used, pa, val);
     address_space_cache_invalidate(&caches->used, pa, sizeof(val));
@@ -552,7 +609,7 @@  void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
 
     max = vq->vring.num;
     caches = atomic_rcu_read(&vq->vring.caches);
-    if (caches->desc.len < max * sizeof(VRingDesc)) {
+    if (!caches || caches->desc.len < max * sizeof(VRingDesc)) {
         virtio_error(vdev, "Cannot map descriptor ring");
         goto err;
     }
@@ -819,7 +876,7 @@  void *virtqueue_pop(VirtQueue *vq, size_t sz)
     i = head;
 
     caches = atomic_rcu_read(&vq->vring.caches);
-    if (caches->desc.len < max * sizeof(VRingDesc)) {
+    if (!caches || caches->desc.len < max * sizeof(VRingDesc)) {
         virtio_error(vdev, "Cannot map descriptor ring");
         goto done;
     }
@@ -1117,6 +1174,15 @@  static enum virtio_device_endian virtio_current_cpu_endian(void)
     }
 }
 
+static void virtio_virtqueue_reset_region_cache(struct VirtQueue *vq)
+{
+    VRingMemoryRegionCaches *caches;
+
+    caches = atomic_read(&vq->vring.caches);
+    atomic_set(&vq->vring.caches, NULL);
+    virtio_free_region_cache(caches);
+}
+
 void virtio_reset(void *opaque)
 {
     VirtIODevice *vdev = opaque;
@@ -1157,6 +1223,7 @@  void virtio_reset(void *opaque)
         vdev->vq[i].notification = true;
         vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
         vdev->vq[i].inuse = 0;
+        virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
     }
 }
 
@@ -2451,13 +2518,10 @@  static void virtio_device_free_virtqueues(VirtIODevice *vdev)
     }
 
     for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
-        VRingMemoryRegionCaches *caches;
         if (vdev->vq[i].vring.num == 0) {
             break;
         }
-        caches = atomic_read(&vdev->vq[i].vring.caches);
-        atomic_set(&vdev->vq[i].vring.caches, NULL);
-        virtio_free_region_cache(caches);
+        virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
     }
     g_free(vdev->vq);
 }