From patchwork Fri Dec 24 11:22:00 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Yoshiaki Tamura
X-Patchwork-Id: 76629
In-Reply-To: <20101224094416.GB23271@redhat.com>
References: <1293160708-30881-1-git-send-email-tamura.yoshiaki@lab.ntt.co.jp>
 <1293160708-30881-7-git-send-email-tamura.yoshiaki@lab.ntt.co.jp>
 <20101224094416.GB23271@redhat.com>
Date: Fri, 24 Dec 2010 20:22:00 +0900
From: Yoshiaki Tamura
To: "Michael S. Tsirkin"
Cc: kwolf@redhat.com, aliguori@us.ibm.com, dlaor@redhat.com,
 ananth@in.ibm.com, kvm@vger.kernel.org, ohmura.kei@lab.ntt.co.jp,
 mtosatti@redhat.com, qemu-devel@nongnu.org, vatsa@linux.vnet.ibm.com,
 avi@redhat.com, psuriset@linux.vnet.ibm.com, stefanha@linux.vnet.ibm.com
Subject: [Qemu-devel] Re: [PATCH 06/19] virtio: update last_avail_idx when
 inuse is decreased.
List-Id: qemu-devel.nongnu.org

2010/12/24 Michael S. Tsirkin:
> On Fri, Dec 24, 2010 at 12:18:15PM +0900, Yoshiaki Tamura wrote:
>> virtio save/load is currently sending last_avail_idx, but inuse isn't.
>> This causes inconsistent state when using Kemari which replays
>> outstanding requests on the secondary.
>> By letting last_avail_idx be updated after inuse is decreased, it would
>> be possible to replay the outstanding requests.  Note that live
>> migration shouldn't be affected because it waits until all requests
>> are flushed.  Also, in conjunction with event-tap, request inversion
>> should be avoided.
>>
>> Signed-off-by: Yoshiaki Tamura
>
> I think I understood the request inversion. My question now is,
> event-tap transfers inuse events as well, won't the same
> request be repeated twice?
>
>> ---
>>  hw/virtio.c |    8 +++++++-
>>  1 files changed, 7 insertions(+), 1 deletions(-)
>>
>> diff --git a/hw/virtio.c b/hw/virtio.c
>> index 07dbf86..f915c46 100644
>> --- a/hw/virtio.c
>> +++ b/hw/virtio.c
>> @@ -72,7 +72,7 @@ struct VirtQueue
>>      VRing vring;
>>      target_phys_addr_t pa;
>>      uint16_t last_avail_idx;
>> -    int inuse;
>> +    uint16_t inuse;
>>      uint16_t vector;
>>      void (*handle_output)(VirtIODevice *vdev, VirtQueue *vq);
>>      VirtIODevice *vdev;
>> @@ -671,6 +671,7 @@ void virtio_save(VirtIODevice *vdev, QEMUFile *f)
>>          qemu_put_be32(f, vdev->vq[i].vring.num);
>>          qemu_put_be64(f, vdev->vq[i].pa);
>>          qemu_put_be16s(f, &vdev->vq[i].last_avail_idx);
>> +        qemu_put_be16s(f, &vdev->vq[i].inuse);
>>          if (vdev->binding->save_queue)
>>              vdev->binding->save_queue(vdev->binding_opaque, i, f);
>>      }
>> @@ -710,6 +711,11 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
>>          vdev->vq[i].vring.num = qemu_get_be32(f);
>>          vdev->vq[i].pa = qemu_get_be64(f);
>>          qemu_get_be16s(f, &vdev->vq[i].last_avail_idx);
>> +        qemu_get_be16s(f, &vdev->vq[i].inuse);
>> +
>> +        /* revert last_avail_idx if there are outstanding emulation. */
>
> if there are outstanding emulation -> if requests
> are outstanding in event-tap?
>
>> +        vdev->vq[i].last_avail_idx -= vdev->vq[i].inuse;
>> +        vdev->vq[i].inuse = 0;
>>
>
> I don't understand it, if this is all we do we can equivalently
> decrement on the sender side and avoid breaking migration compatibility?

It seems I sent the old patch...  I'm really sorry.  Currently I'm
taking the approach of updating last_avail_idx later.  Decreasing it
looks scary to me if the guest already knows about it.

commit 8ac6ba51cc558b3bfcac7a5814d92f275ee874e9
Author: Yoshiaki Tamura
Date:   Mon May 17 10:36:14 2010 +0900

    virtio: update last_avail_idx when inuse is decreased.

    virtio save/load is currently sending last_avail_idx, but inuse isn't.
    This causes inconsistent state when using Kemari which replays
    outstanding requests on the secondary.  By letting last_avail_idx be
    updated after inuse is decreased, it would be possible to replay the
    outstanding requests.  Note that live migration shouldn't be affected
    because it waits until all requests are flushed.  Also, in
    conjunction with event-tap, request inversion should be avoided.

    Signed-off-by: Yoshiaki Tamura

>
>>          if (vdev->vq[i].pa) {
>>              uint16_t nheads;
>> --
>> 1.7.1.2

diff --git a/hw/virtio.c b/hw/virtio.c
index 07dbf86..b1586da 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -198,7 +198,7 @@ int virtio_queue_ready(VirtQueue *vq)
 
 int virtio_queue_empty(VirtQueue *vq)
 {
-    return vring_avail_idx(vq) == vq->last_avail_idx;
+    return vring_avail_idx(vq) == vq->last_avail_idx + vq->inuse;
 }
 
 void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
@@ -238,6 +238,7 @@ void virtqueue_flush(VirtQueue *vq, unsigned int count)
     wmb();
     trace_virtqueue_flush(vq, count);
     vring_used_idx_increment(vq, count);
+    vq->last_avail_idx += count;
     vq->inuse -= count;
 }
 
@@ -306,7 +307,7 @@ int virtqueue_avail_bytes(VirtQueue *vq, int in_bytes, int o
     unsigned int idx;
     int total_bufs, in_total, out_total;
 
-    idx = vq->last_avail_idx;
+    idx = vq->last_avail_idx + vq->inuse;
 
     total_bufs = in_total = out_total = 0;
     while (virtqueue_num_heads(vq, idx)) {
@@ -386,7 +387,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
     unsigned int i, head, max;
     target_phys_addr_t desc_pa = vq->vring.desc;
 
-    if (!virtqueue_num_heads(vq, vq->last_avail_idx))
+    if (!virtqueue_num_heads(vq, vq->last_avail_idx + vq->inuse))
         return 0;
 
     /* When we start there are none of either input nor output. */
@@ -394,7 +395,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 
     max = vq->vring.num;
 
-    i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
+    i = head = virtqueue_get_head(vq, vq->last_avail_idx + vq->inuse);
 
     if (vring_desc_flags(desc_pa, i) & VRING_DESC_F_INDIRECT) {
         if (vring_desc_len(desc_pa, i) % sizeof(VRingDesc)) {
@@ -626,7 +627,7 @@ void virtio_notify(VirtIODevice *vdev, VirtQueue *vq)
     /* Always notify when queue is empty (when feature acknowledge) */
     if ((vring_avail_flags(vq) & VRING_AVAIL_F_NO_INTERRUPT) &&
         (!(vdev->guest_features & (1 << VIRTIO_F_NOTIFY_ON_EMPTY)) ||
-         (vq->inuse || vring_avail_idx(vq) != vq->last_avail_idx)))
+         (vq->inuse || vring_avail_idx(vq) != vq->last_avail_idx + vq->inuse)))
         return;
 
     trace_virtio_notify(vdev, vq);
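
For illustration only, the index bookkeeping in the diff above can be sketched as a small standalone model. This is a toy, not QEMU code: struct toy_vq and the toy_* helpers are hypothetical names, avail_idx stands in for vring_avail_idx(vq), and descriptor handling is left out entirely. The sketch assumes that only last_avail_idx reaches the secondary and that inuse starts from 0 there. It also truncates last_avail_idx + inuse to uint16_t before comparing, since C promotes both operands to int before the addition; that cast is a choice made for the sketch and is not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model of the queue indices; not QEMU's VirtQueue. */
struct toy_vq {
    uint16_t avail_idx;      /* stand-in for vring_avail_idx(vq) */
    uint16_t last_avail_idx; /* advanced only when requests are flushed */
    uint16_t inuse;          /* popped but not yet flushed */
};

/* Next pop position: completed requests plus requests still in flight. */
static uint16_t toy_pop_pos(const struct toy_vq *vq)
{
    return (uint16_t)(vq->last_avail_idx + vq->inuse);
}

/* Same shape as the patched virtio_queue_empty(). */
static int toy_queue_empty(const struct toy_vq *vq)
{
    return vq->avail_idx == toy_pop_pos(vq);
}

/* Pop one request: returns its ring position, or -1 if nothing is pending.
 * last_avail_idx is deliberately left untouched here. */
static int toy_pop(struct toy_vq *vq)
{
    if (toy_queue_empty(vq)) {
        return -1;
    }
    int pos = toy_pop_pos(vq);
    vq->inuse++;
    return pos;
}

/* Complete `count` requests: only now does last_avail_idx move forward. */
static void toy_flush(struct toy_vq *vq, uint16_t count)
{
    assert(count <= vq->inuse);
    vq->last_avail_idx += count;
    vq->inuse -= count;
}

/* Guest publishes 3 requests, the primary pops all 3 and completes 1.
 * Returns how many requests a secondary would pop again after taking
 * over with the primary's last_avail_idx and inuse reset to 0. */
static int toy_replay_demo(void)
{
    struct toy_vq primary = { .avail_idx = 3, .last_avail_idx = 0, .inuse = 0 };
    toy_pop(&primary);
    toy_pop(&primary);
    toy_pop(&primary);
    toy_flush(&primary, 1);

    struct toy_vq secondary = {
        .avail_idx = primary.avail_idx,
        .last_avail_idx = primary.last_avail_idx,
        .inuse = 0,
    };
    int replayed = 0;
    while (toy_pop(&secondary) >= 0) {
        replayed++;
    }
    return replayed;
}
```

In this scenario toy_replay_demo() returns 2: the two requests that were popped but never flushed on the primary are the ones the secondary sees as pending, while the completed one is not repeated.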