Patchwork [PATCHv3,3/3] virtio: order index/descriptor reads

Submitter Michael S. Tsirkin
Date April 24, 2012, 3:33 p.m.
Message ID <5f860186390d1286efa9318388aaef22d3bc3e05.1335281438.git.mst@redhat.com>
Permalink /patch/154720/
State New

Comments

Michael S. Tsirkin - April 24, 2012, 3:33 p.m.
virtio has the equivalent of:

	if (vq->last_avail_idx != vring_avail_idx(vq)) {
		read descriptor head at vq->last_avail_idx;
	}

In theory, the processor can reorder the descriptor head
read so that it happens speculatively before the index read.
This would trigger the following race:

	host descriptor head read <- reads invalid head from ring
		guest writes valid descriptor head
		guest writes avail index
	host avail index read <- observes valid index

As a result, the host will use an invalid head value.
I have not observed this in the field, but after
the experience with the previous two races
I think it is prudent to address this theoretical race condition.
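
For illustration only, the ordering requirement described above can be
sketched with C11 fences. This is not the QEMU code: struct demo_ring,
guest_publish() and host_pop() are hypothetical names, the ring-full
check is omitted, and the acquire/release fences are assumed here as
rough stand-ins for smp_rmb()/smp_wmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8u /* hypothetical size, power of two */

/* Hypothetical stand-in for the avail ring shared by guest and host. */
struct demo_ring {
    atomic_uint heads[RING_SIZE]; /* one descriptor head per slot */
    atomic_uint avail_idx;        /* free-running index, written by the guest */
};

/* Guest side: write the head first, then publish the new index.
 * The ring-full check is left out to keep the sketch short. */
static void guest_publish(struct demo_ring *r, unsigned head)
{
    unsigned idx = atomic_load_explicit(&r->avail_idx, memory_order_relaxed);

    atomic_store_explicit(&r->heads[idx % RING_SIZE], head,
                          memory_order_relaxed);
    /* Assumed analogue of smp_wmb(): head is visible before the index. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&r->avail_idx, idx + 1, memory_order_relaxed);
}

/* Host side: returns true and stores a head if one is available. */
static bool host_pop(struct demo_ring *r, unsigned *last_avail_idx,
                     unsigned *head)
{
    unsigned idx = atomic_load_explicit(&r->avail_idx, memory_order_relaxed);

    if (idx == *last_avail_idx) {
        return false; /* nothing new: no slot is read, so no fence is needed */
    }
    /* Assumed analogue of smp_rmb(): the slot read below must not be
     * satisfied before the index read above. */
    atomic_thread_fence(memory_order_acquire);
    *head = atomic_load_explicit(&r->heads[*last_avail_idx % RING_SIZE],
                                 memory_order_relaxed);
    (*last_avail_idx)++;
    return true;
}
```

Calling guest_publish() a few times and then host_pop() from a single
thread only exercises the control flow; it shows where the fences sit,
not that the reordering can actually be observed on a given machine.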

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/virtio.c    |    5 +++++
 qemu-barrier.h |   12 +++++++++++-
 2 files changed, 16 insertions(+), 1 deletions(-)

Patch

diff --git a/hw/virtio.c b/hw/virtio.c
index def0bf1..c081e1b 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -287,6 +287,11 @@  static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
                      idx, vring_avail_idx(vq));
         exit(1);
     }
+    /* On success, callers read a descriptor at vq->last_avail_idx.
+     * Make sure descriptor read does not bypass avail index read. */
+    if (num_heads) {
+        smp_rmb();
+    }
 
     return num_heads;
 }
diff --git a/qemu-barrier.h b/qemu-barrier.h
index f0b842e..c89d312 100644
--- a/qemu-barrier.h
+++ b/qemu-barrier.h
@@ -24,10 +24,13 @@ 
 #define smp_mb() asm volatile("lock; addl $0,0(%%esp) " ::: "memory")
 #endif
 
+#define smp_rmb() smp_mb()
+
 #elif defined(__x86_64__)
 
 #define smp_wmb()   barrier()
 #define smp_mb() asm volatile("mfence" ::: "memory")
+#define smp_rmb() asm volatile("lfence" ::: "memory")
 
 #elif defined(_ARCH_PPC)
 
@@ -39,16 +42,23 @@ 
 #define smp_wmb()   asm volatile("eieio" ::: "memory")
 #define smp_mb()   asm volatile("sync" ::: "memory")
 
+#if defined(__powerpc64__)
+#define smp_rmb()   asm volatile("lwsync" ::: "memory")
+#else
+#define smp_rmb()   asm volatile("sync" ::: "memory")
+#endif
+
 #else
 
 /*
  * For (host) platforms we don't have explicit barrier definitions
  * for, we use the gcc __sync_synchronize() primitive to generate a
  * full barrier.  This should be safe on all platforms, though it may
- * be overkill for wmb().
+ * be overkill for wmb() and rmb().
  */
 #define smp_wmb()   __sync_synchronize()
 #define smp_mb()   __sync_synchronize()
+#define smp_rmb()   __sync_synchronize()
 
 #endif