From patchwork Mon Mar 21 10:57:18 2016
X-Patchwork-Submitter: Fam Zheng
X-Patchwork-Id: 600060
Date: Mon, 21 Mar 2016 18:57:18 +0800
From: Fam Zheng
To: tu bo
Message-ID: <20160321105718.GA7710@ad.usersys.redhat.com>
References: <1458123018-18651-1-git-send-email-famz@redhat.com>
 <56E9355A.5070700@redhat.com>
 <56E93A22.1080102@de.ibm.com>
 <56E93ECE.10103@redhat.com>
 <56E9425C.8030201@de.ibm.com>
 <56E957AD.2050005@redhat.com>
 <56E961EA.4090908@de.ibm.com>
 <56E9638B.5090204@redhat.com>
 <20160317003906.GA23821@ad.usersys.redhat.com>
 <56EA8EEE.2020801@linux.vnet.ibm.com>
In-Reply-To: <56EA8EEE.2020801@linux.vnet.ibm.com>
Cc: Kevin Wolf, qemu-block@nongnu.org, "Michael S. Tsirkin",
 qemu-devel@nongnu.org, Christian Borntraeger, Stefan Hajnoczi,
 cornelia.huck@de.ibm.com, Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop

On Thu, 03/17 19:03, tu bo wrote:
> 
> On 03/17/2016 08:39 AM, Fam Zheng wrote:
> > On Wed, 03/16 14:45, Paolo Bonzini wrote:
> >>
> >> On 16/03/2016 14:38, Christian Borntraeger wrote:
> >>>> If you just remove the calls to virtio_queue_host_notifier_read, here
> >>>> and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
> >>>> (keeping patches 2-4 in)?
> >>>
> >>> With these changes and patches 2-4 it no longer locks up.
> >>> I'll keep it running for some hours to check whether a crash happens.
> >>>
> >>> Tu Bo, your setup is currently better suited for reproducing. Can you
> >>> also check?
> >>
> >> Great, I'll prepare a patch for virtio then, sketching the solution that
> >> Conny agreed with.
> >>
> >> While Fam and I agreed that patch 1 is not required, I'm not sure if the
> >> mutex is necessary in the end.
> >
> > If we can fix this from the virtio_queue_host_notifier_read side, the
> > mutex/BH are not necessary; but OTOH the mutex does catch such bugs, so
> > maybe it's good to have it.
> > I'm not sure about the BH.
> >
> > And in hindsight I realize we don't want patches 2-3 either. Actually the
> > begin/end pair won't work as expected because of blk_set_aio_context.
> >
> > Let's hold off on this series.
> >
> >>
> >> So if Tu Bo can check without the virtio_queue_host_notifier_read calls,
> >> and both with/without Fam's patches, it would be great.
> >
> > Tu Bo, only with/without patch 4, if you want to check. Sorry for the
> > noise.
> 
> 1. without the virtio_queue_host_notifier_read calls, without patch 4
> 
> crash happens very often,
> 
> (gdb) bt
> #0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
> #1  0x000002aa165da37e in coroutine_trampoline (i0=<optimized out>,
>     i1=1812051552) at util/coroutine-ucontext.c:79
> #2  0x000003ff7dd5150a in __makecontext_ret () from /lib64/libc.so.6
> 
> 2. without the virtio_queue_host_notifier_read calls, with patch 4
> 
> crash happens very often,
> 
> (gdb) bt
> #0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
> #1  0x000002aa39dda43e in coroutine_trampoline (i0=<optimized out>,
>     i1=-1677715600) at util/coroutine-ucontext.c:79
> #2  0x000003ffab6d150a in __makecontext_ret () from /lib64/libc.so.6

Tu Bo, could you help test this patch (on top of master, without patch 4)?

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 08275a9..47f8043 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
 
 void virtio_queue_notify(VirtIODevice *vdev, int n)
 {
-    virtio_queue_notify_vq(&vdev->vq[n]);
+    VirtQueue *vq = &vdev->vq[n];
+    EventNotifier *notifier;
+    notifier = virtio_queue_get_host_notifier(vq);
+    if (notifier) {
+        event_notifier_set(notifier);
+    } else {
+        virtio_queue_notify_vq(vq);
+    }
 }
 
 uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)